53. (Amended) A viral vector comprising a nucleic acid of claim 1.

54. (Amended) A fusion polypeptide comprising an amino acid sequence
encoded by the nucleic acid sequence of claim 1 fused to a heterologous amino acid
sequence.



70. (Amended) A nucleic acid according to claim 1 attached to a solid support. 




72. (New) The process of claim 8 further comprising isolating the
polypeptide from the culture.



REMARKS 

I. Explanation of Amendments 

A marked-up version of the changes made to the claims and specification can be 
found in Appendix A hereto. As a convenience to the examiner, the applicants have set forth 
the pending claims in Appendix B as they should appear after entry of the foregoing 
amendment. 

In paragraph 2 of the Office action, the examiner objected to the title of the invention.
Specifically, the examiner alleged the title was not descriptive. The applicants have amended the
title, which is now clearly indicative of the invention to which the claims are directed.

In paragraph 3 of the Office action, the examiner objected to claims 1-8, 10, 11, 51-55,
70, and 71 for reciting non-elected sequences SEQ ID NO: 3 and SEQ ID NO: 4.
Amendments removing the recitation of the non-elected SEQ ID NOS: 3 and 4 in claims 1, 4,
51, 53, and 70, and the cancellation of claims 2, 3, and 71 obviate this objection.

In paragraph 3 of the Office action, the examiner also objected to claims 54 and 55 for 
reciting non-elected claims 13, 14, and 15. The applicants have amended claims 54 and 55 to 
address the objection. Support for the nucleic acids recited in claims 54 and 55 can be found
in the claims as originally filed and in the specification, from page 35, line 11, to page 38,
line 10.

Claims 1, 4, 51, 52, and 53 have been amended to remove the unnecessary
word "molecule", an amendment which is not narrowing and which is not related to
patentability. Claim 1 was further amended in part (b) to recite a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence as set forth in SEQ ID NO: 2.

Amended claim 8 recites a process of producing a CD20/IgE-receptor like 
polypeptide comprising culturing the host cell of claim 5 under suitable conditions to express 
a CD20/IgE-receptor like polypeptide encoded by the nucleic acid. This amendment finds 
support in the parent claims from which claim 8 depends, and is not in response to any 
rejection. 

Amended claim 10 recites a process of claim 8 wherein the vector further comprises 
a heterologous promoter operatively linked to the nucleotide sequence encoding the 
CD20/IgE-receptor like polypeptide. This amendment states the subject matter of the claim 
more succinctly without narrowing the scope of the claim. Support for the phrase 
"heterologous promoter" is found throughout the specification, for example, on page 52, lines 
2-9. 

Claim 70 has been amended to recite a term with verbatim antecedent basis and not to 
narrow the claim or for any reason related to patentability. 

New claim 72 recites the process of claim 8 further comprising isolating the 
polypeptide from the culture. Support for the isolation of the polypeptide can be found in 
originally filed claim 8 and in the specification on pages 58 and 59. 

Claims 9, 12-50, and 56-69 were cancelled because these claims were withdrawn 
from consideration as being drawn to a non-elected invention, and not for any reason related 
to patentability. The applicants do not intend by these or any amendments to abandon subject 
matter of the claims as originally filed or later presented, and reserve the right to pursue such 
subject matter in continuing applications. 

II. The Rejections under 35 U.S.C. § 101 and 35 U.S.C. § 112, First Paragraph, 
Should Be Withdrawn 

A. The Claims Are Directed to Statutory Subject Matter 

In paragraph 4 of the Office action, the Patent Office rejected claims 5-7 under 35
U.S.C. § 101 for allegedly being directed to non-statutory subject matter. The examiner asserts
that the recited host cells in claims 5-7 encompass human cells, fetuses and embryos as well
as non-human cells including chimeric animals, germ cells, fertilized eggs, fetal tissues and
organs.

In its rejection, the Patent Office seems to be stating that subject matter is non- 
statutory if it can be read to "embrace" a cell in a human or animal. However, no authority is 
cited for this proposition. In fact, the Patent Office has granted numerous patents on subject 
matter that is used in humans. For example, pharmaceuticals and foodstuffs are patentable 
subject matter, even though they are administered to or ingested by humans, or animals. 
Medical devices, such as artificial body parts, implantable stents and prostheses, and the like, 
are patentable subject matter, even though they are implanted transiently or permanently in a 
human or animal. These are just some examples of statutory subject matter that can 
transiently or permanently be introduced to a human. 

The foregoing examples can be distinguished from a patent claim to a human per se. 
However, there is no basis for interpreting claims 5-7 as reading on a human per se, just as 
there is no basis for interpreting medical device claims as reading on the patients into whom 
the device is implanted. For these reasons, the rejection should be withdrawn. 

In the Office action, the Patent Office also suggested that amending the claims to 
recite "non-human" would obviate the rejection. This suggestion has not been adopted 
because the applicants contemplate human host cells, and there is no statutory prohibition 
against them. For example, the application describes using human cell lines, such as human 
embryonic kidney cells (HEK) 293 or 293T cells (page 56, lines 15 and 16) for recombinant 
production of the polypeptide. 

The Patent Office also suggested that amendment to reflect "hand-of-man" would be 
appropriate. However, claim 5 already requires the "hand-of-man" to make a vector 
comprising the nucleic acid and to introduce the vector into a host cell, so no further 
amendment is believed to be necessary. 

B. The Claims Are Directed to a Credible, Specific and Substantial Utility 

In paragraph 5a of the Office action, the Patent Office rejected claims 1-8, 10, 11, 51-55,
70, and 71 under 35 U.S.C. § 101 for assertedly lacking a credible, specific, and
substantial or well-established utility. Specifically, the examiner alleged that the
specification does not reasonably provide enablement to obtain the functional data needed to
permit one to produce a nucleic acid with the functional (a defined biological activity)
requirements of the claims. Cancellation of claims 2, 3, 11 and 71 by amendment herein has
rendered moot the rejections as applied to those claims. The applicants respectfully traverse
these rejections as applied to claims 1, 4-8, 10, 51-55, and 70.

The Patent Office's basis for rejection focused on whether or not the application 
contained a sufficient teaching of protein "activity." Not only is this allegation incorrect, but 
it is also too narrowly focused, because "protein activity" is not the only specific, substantial, 
and credible utility for the biological molecules claimed. For example, the polynucleotide
SEQ ID NO: 1 can be used as a marker for testicular cells specifically expressing this nucleic
acid molecule. The specification contemplates using SEQ ID NO: 1 as a tissue specific
probe, e.g., page 41, lines 30-32. In Example 3, at page 112, lines 13-18, the application
teaches that the DNA sequence from clone agp-96614-al (SEQ ID NO: 1) was used as a
probe for a Northern tissue expression experiment and the results show a specific
(predominant) tissue expression from the testes (see footnote 1). Using SEQ ID NO: 1 as a
probe is a "credible" utility according to page 5 of the Revised Interim Utility Guidelines
Training Materials (see footnote 2). In addition, this utility is "substantial" because a
testes-specific probe, like any tissue specific probe, can be used in a variety of real world
contexts to identify the tissue type of cells. For example, a pathologist can use tissue
specific probes to identify cell types in a biological sample, which may be helpful, e.g., to
screen for tumor metastases. Thus, polynucleotides of the invention have a credible, specific
and substantial utility as a tissue-specific probe.

By way of analogy, the encoded protein, which is a 4-transmembrane protein 
expressed on cell surfaces, is useful as a tissue specific marker as well, e.g., using antibodies 
for detection. 

In addition, the polynucleotide SEQ ID NO: 1 is useful to map the chromosomal
location of the CD20/IgE receptor-like gene. The specification teaches the use of SEQ ID NO: 1
as a chromosomal marker for itself and related genes on the chromosome on page 105, lines
28-33. The use of SEQ ID NO: 1 as a chromosomal marker is a "credible utility" according to
the Revised Interim Utility Guidelines Training Materials on page 5 (see footnote 2). SEQ
ID NO: 1 encodes a cell surface receptor and is a member of the MS4A family
(membrane-spanning 4-domain family, subfamily A) whose members include CD20, FcεRIβ
(IgE), and HTm4. [See Liang et al., Genomics 72:119-127, 120 (2001) (See Appendix D)
and specification at page 11, lines 5-9, and Figure 3.] This utility as a chromosomal marker
has been confirmed in the literature, which has reported that SEQ ID NO: 1 and the
related CD20, IgE, and HTm4 genes are clustered within the same region of chromosome 11
(11q12-13). (See Hulett et al., at page 378). Chromosomal aberrations within this region of
chromosome 11 are known to be linked to non-Hodgkin's lymphoma (McLaughlin et al.,
1998) and pathogenesis of various allergic diseases (Adra et al., 1999) (See Appendix E).
Therefore, SEQ ID NO: 1 has a real world use as a chromosomal marker for chromosome
11q12-13. Chromosomal markers can also be used as probes for detecting various forms of
aneuploidy, genetic disorders characterized by an abnormal number of chromosomes. Thus,
the polynucleotides of the invention also have a credible, specific, and substantial utility as a
chromosomal marker.

1 This expression profile was independently confirmed in Hulett et al. (Biochem and Biophys.
Research Communications 280:314-319 (2001)) (See Appendix C). Compare page 377,
second column and Figure 3 of Hulett et al. with Example 3 of the specification.

2 "A credible utility is assessed from the standpoint of whether a person of ordinary skill in
the art would accept that the recited or disclosed invention is currently available for such use.
For example, no perpetual motion would be considered to be currently available. However,
nucleic acids could be used as probes, chromosome markers, or forensic or diagnostic
markers." Page 5, lines 11-18 of the Revised Interim Utility Guidelines Training Materials.

For reasons set forth above, the nucleic acid sequence of SEQ ID NO: 1 has a 
credible, specific and substantial utility as a probe and as a chromosomal marker. 
Accordingly, the applicants respectfully request that the rejection of the claims under 35
U.S.C. §101 be withdrawn.

III. The Rejections Under 35 U.S.C. §112, First Paragraph, Should Be Withdrawn 
A. The Claims Are Enabled by the Specification 

In paragraph 5b of the office action, the examiner rejected claims 1-8, 10, 11, 51-55, 
70, and 71 under 35 U.S.C. §112, first paragraph, for allegedly lacking a substantial asserted
utility, such that the specification allegedly did not enable one of skill in the art to use the 
claimed invention. As explained in detail in the preceding section, the application teaches 
one of ordinary skill how to use polynucleotides of the invention as a probe or marker, for 
example.

The examiner further alleged that the specification does not reasonably provide
enablement to obtain the functional and structural data needed to permit one to produce a 
nucleic acid which meets both the structural (at least 70% identical to the polypeptide of SEQ 
ID NO: 2) and functional (biological activity) requirements of the claims. These rejections 
have been rendered moot by this amendment. 

The applicants have amended claim 1, which is now directed to (a) polynucleotides
comprising the nucleotide sequence of SEQ ID NO: 1, (b) polynucleotides that encode the 
amino acid sequence of SEQ ID NO: 2, and (c) polynucleotides that are fully complementary 
to (a) or (b). The specification enables one of skill in the art to make and use the nucleic acid 
molecule of SEQ ID NO: 1 (See discussion above). Accordingly, claim 1 and each one of 
claims 4-8, 10, 51-55, 70, and 71, which ultimately depend from claim 1, are enabled. 

B. The Claims Are Supported by an Adequate Written Description 

The examiner alleged that claims containing the phrases "allelic variant or splice
variant," "an isolated nucleic acid encoding a polypeptide that is at least about 70%, 75%,
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the polypeptide of SEQ ID NO:
2," "a nucleotide sequence which hybridizes under moderately or highly stringent conditions
to the complement of . . ., wherein the encoded polypeptide has an activity of the polypeptide
as set forth in SEQ ID NO: 2," and "at least one conservative amino acid insertion,
substitution, or deletion" failed to meet the written description requirement. [Office Action,
paragraphs 5c and 5d].

These rejections are now moot. Solely for the purpose of expediting prosecution, the 
applicants have removed these phrases from claim 1 and cancelled claims 2 and 3, without 
prejudice. In light of the foregoing amendments, the applicants request that the rejections 
under 35 U.S.C. §112, first paragraph, be withdrawn. 



IV. The Rejections Under 35 U.S.C. §112, Second Paragraph, Should Be Withdrawn 

In paragraph 6 of the Office action, the examiner rejected claims 1-8, 10, 11, 51-55, 
70, and 71 under 35 U.S.C. §112, second paragraph, as assertedly containing indefinite 
terminology.

The examiner alleged that the phrases "an activity of the polypeptide set forth in SEQ
ID NO: 2" and "hybridizes under moderately or highly stringent hybridization conditions" in 
claim 1 were indefinite. Solely for the purpose of expediting prosecution, the applicants have 
deleted these phrases from the claims. In addition, the examiner further alleged that line 3 of 
claim 1 is improper because the phrase "the nucleotide sequence" lacks antecedent basis. 
While the applicants believe the phrase in question is conventional in patent claims and 
susceptible to only one interpretation, they have amended line 3 to "a nucleotide sequence," 
since such amendment is not narrowing in any way. Accordingly, these rejections are rendered
moot.

The rejection of claims 2 and 3 is moot, because the claims have been cancelled.

The examiner rejected claim 8 under §112, second paragraph, as being indefinite for 
the recitation of the term "optionally" in a final isolation step. The applicants have amended 
claim 8 by removing this final step, and added new dependent claim 72, which recites the 
process of claim 8 wherein the polypeptide is isolated. These amendments thereby render 
moot the basis for rejecting claim 8 and its other dependent claims. 

The examiner rejected claims 51 and 53 under §112, second paragraph, as being
indefinite and vague for the recitation "of claims 1, 2 or 3." Due to the cancellation of claims
2 and 3, claims 51 and 53 have been amended to depend solely from claim 1, thereby rendering
moot the basis for rejecting claims 51 and 53 and their dependent claims. In light of the
foregoing amendments, the applicants submit that the rejections of claims 1-8, 10, 11, 51-55,
70, and 71 under 35 U.S.C. §112, second paragraph, for indefiniteness have been overcome
and should be withdrawn.

V. The Rejection Under 35 U.S.C. §102(b) Should Be Withdrawn

In paragraph 7 of the Office action, the examiner rejected claims 1-3 alleging that
these claims were anticipated under 35 U.S.C. § 102(b) by Hillier et al., WashU-NCI Human
EST Project (1997) (hereafter Hillier et al.).

The rejection under 35 U.S.C. § 102(b) based on Hillier et al. is now moot. Solely for
purposes of expediting prosecution, the applicants have amended claim 1 to distinguish the
claimed invention from the disclosure of the cited document and cancelled claims 2 and 3
without prejudice. The amended claims recite sequences which are neither disclosed nor
suggested by the cited document. Therefore, the applicants respectfully request that the
rejection under § 102(b) be withdrawn.

VI. The Rejections Under 35 U.S.C. §103(a) Should Be Withdrawn 

In paragraphs 8a and 8b of the Office action, the examiner rejected claims 4-8, 10, 11,
and 51-53 under 35 U.S.C. §103(a) as allegedly being obvious in light of the disclosure of
Hillier et al., and rejected claims 54 and 55 as allegedly being obvious in light of Hillier et al.
in view of Capon et al. (U.S. Patent Number 5,116,964) (hereafter Capon et al.).

These rejections have been rendered moot by the instant amendment of the relevant
claims. Claim 11 has been cancelled by amendments herein. Amended claims 4-8, 10, and
51-53 are now directed to either host cells, polynucleotide compositions, or vectors with (a)
the nucleic acid sequence of SEQ ID NO: 1; or (b) a nucleotide sequence encoding a
polypeptide having an amino acid sequence as set forth in SEQ ID NO: 2; or (c) a nucleotide
sequence fully complementary to (a) or (b) by virtue of their dependency from claim 1.
Hillier et al. neither teaches nor suggests the nucleic acid molecule set forth in SEQ ID NO: 1
nor a nucleotide sequence that encodes the amino acid sequence as set forth in SEQ ID
NO: 2. Furthermore, Hillier et al. fails to disclose or suggest a sequence sufficient in length
to encode applicants' polypeptide (in fact Hillier et al. fails to disclose or suggest the
existence of any encoded polypeptide). One of skill in the art would not be motivated to
express the nucleic acid sequences of claim 1 in a host according to this teaching.

The amendment to claim 1 further renders moot the rejection of claims 54 and 55.
As discussed above, Hillier et al. fails to teach an encoded protein. Capon et al. is cited in
this rejection only for its teachings related to fusion proteins encoded by heterologous
nucleic acid constructs. Since Hillier et al. is silent with regard to encoded proteins, there
would have been no motivation to combine Hillier et al. with Capon et al.

Accordingly, Hillier et al., alone or in combination with Capon et al., neither disclose 
nor suggest the same DNA within vectors, host cells, DNA compositions, or fusion proteins 
of dependent claims 4-8, 10, and 51-55. Thus, these references do not render claims 4-8, 10, 
and 51-55 obvious under 35 U.S.C. §103(a). 

SUMMARY

In view of the amendment and remarks made herein, the applicants believe that claims
1, 4-8, 10, 51-55, 70, and 71 are in condition for allowance and request notification of the
same.



Respectfully submitted, 

MARSHALL, GERSTEIN & BORUN 

6300 Sears Tower 

233 South Wacker Drive 

Chicago, Illinois 60606-6357 

(312) 474-6300 




David A. Gass 



Reg. No: 38,153 
Attorney for Applicants 



January 15, 2003 



APPENDIX A 

VERSION WITH MARKINGS TO SHOW CHANGES MADE 



IN THE SPECIFICATION: 

Please replace the first line of page 1 with the following title: 

Polynucleotides Encoding a CD20/IgE-Receptor Like Molecule and Uses Thereof 

IN THE CLAIMS: 

Please cancel claims 2, 3, 9, 11-50, 56-69, and 71. 
Please amend claims 1, 5-7, 51, 53, 54, and 70.

1. (Amended) An isolated nucleic acid [molecule] comprising a nucleotide
sequence selected from the group consisting of:

(a) [the] a nucleotide sequence set forth [in either SEQ ID NO: 1 or SEQ ID
NO: 3] in SEQ ID NO: 1;

(b) a nucleotide sequence encoding a polypeptide [having] comprising an amino
acid sequence as set forth in [either SEQ ID NO: 2 or SEQ ID NO: 4] SEQ ID NO: 2; and

(c) a nucleotide sequence fully complementary to [any of (a) - (c)] (a) or (b).

4. (Amended) A vector comprising the nucleic acid [molecule] of [claims 1, 2,
or 3] claim 1.

8. (Amended) A process of producing a CD20/IgE-receptor like polypeptide 
comprising culturing the host cell of claim 5 under suitable conditions to express [the 
polypeptide, and optionally isolating the polypeptide from the culture.] a CD20/IgE-receptor 
like polypeptide encoded by the nucleic acid. 

10. (Amended) The process of claim 8, wherein the [nucleic acid molecule] 
vector further comprises a heterologous promoter [DNA other than the promoter DNA for the 
native CD20/IgE-receptor like polypeptide] operatively linked to the [DNA] nucleotide
sequence encoding the CD20/IgE-receptor like polypeptide. 

51. (Amended) A composition comprising a nucleic acid [molecule] of [claims
1, 2, or 3] claim 1 and a pharmaceutically acceptable formulating agent.

52. (Amended) A composition of claim 51 wherein said nucleic acid
[molecule] is contained in a viral vector.

53. (Amended) A viral vector comprising a nucleic acid [molecule] of [claims
1, 2, or 3] claim 1.

54. (Amended) A fusion polypeptide comprising [the polypeptide of claims 13, 
14, or 15] an amino acid sequence encoded by the nucleic acid sequence of claim 1 fused to a 
heterologous amino acid sequence. 

70. (Amended) A [polynucleotide] nucleic acid according to [any one of claims 
1 to 3] claim 1 attached to a solid support. 

72. (New) The process of claim 8 further comprising isolating the
polypeptide from the culture.

APPENDIX B

ELECTED CLAIMS STILL PENDING UPON ENTRY OF THE FOREGOING AMENDMENT

1. (Amended) An isolated nucleic acid comprising a nucleotide sequence
selected from the group consisting of: 

(a) a nucleotide sequence set forth in SEQ ID NO: 1; 

(b) a nucleotide sequence encoding a polypeptide comprising an amino acid 
sequence as set forth in SEQ ID NO: 2; and 

(c) a nucleotide sequence fully complementary to (a) or (b). 

4. (Amended) A vector comprising the nucleic acid of claim 1.

5. (Amended) A recombinant host cell comprising the vector of claim 4. 

6. (Amended) The host cell of claim 5 that is a eukaryotic cell. 

7. (Amended) The host cell of claim 5 that is a prokaryotic cell. 

8. (Amended) A process of producing a CD20/IgE-receptor like polypeptide 
comprising culturing the host cell of claim 5 under suitable conditions to express a 
CD20/IgE-receptor like polypeptide encoded by the nucleic acid. 

10. (Amended) The process of claim 8, wherein the vector further comprises a 
heterologous promoter operatively linked to the nucleotide sequence encoding the CD20/IgE- 
receptor like polypeptide. 

51. (Amended) A composition comprising a nucleic acid of claim 1 and a
pharmaceutically acceptable formulating agent.

52. (Amended) A composition of claim 51 wherein said nucleic acid is
contained in a viral vector.

53. (Amended) A viral vector comprising a nucleic acid of claim 1.

54. (Amended) A fusion polypeptide comprising an amino acid sequence
encoded by the nucleic acid sequence of claim 1 fused to a heterologous amino acid 
sequence. 

55. (Amended) The fusion polypeptide of claim 54 wherein the heterologous 
amino acid sequence is an IgG constant domain or fragment thereof. 

70. (Amended) A nucleic acid according to claim 1 attached to a solid support. 

72. (New) The process of claim 8 further comprising isolating the polypeptide 
from the culture. 

Cytogenetics and Molecular Genetics, Women's and Children's Hospital Adelaide, South Australia 5006, Australia



Received November 27, 2000 



CD20 and the β subunit of the high affinity receptor for IgE (FcεRIβ) are related
four-transmembrane molecules that are expressed on the surface of hematopoietic cells and
play crucial roles in signal transduction. Herein, we report the identification and
characterization of a human gene, TETM4, that encodes a novel four-transmembrane protein
related to CD20 and FcεRIβ. The predicted TETM4 protein is 200 amino acids and contains
four putative transmembrane regions, N- and C-terminal cytoplasmic domains, and three
inter-transmembrane loop regions. TETM4 shows 31.0 and 23.2% overall identity with CD20
and FcεRIβ respectively, with the highest identity in the transmembrane regions, whereas the
N- and C-termini and inter-transmembrane loops are more divergent. Northern blot and
RT-PCR analysis suggest that TETM4 mRNA has a highly restricted tissue distribution, being
expressed selectively in the testis. Using fluorescence in situ hybridization and radiation
hybrid analysis, the TETM4 gene has been localized to chromosome 11q12. The genes for
CD20 and FcεRIβ have also been mapped to the same region of chromosome 11
(11q12-13.1), suggesting that these genes have evolved by duplication to form a family of
four-transmembrane genes. TETM4 is the first nonhematopoietic member of the CD20/FcεRIβ
family, and like its hematopoietic-specific relatives, it may be involved in signal
transduction as a component of a multimeric receptor complex. © 2001 Academic Press

Key Words: four-transmembrane; TM4SF; tetraspanin; CD20; FcεRIβ; testis; gene
localization; chromosome 11; signal transduction.

CD20, the β subunit of the high affinity receptor for IgE, and HTm4 comprise a family of
related proteins that contain four membrane spanning regions. All three proteins are
expressed specifically in hematopoietic cells: CD20 on B cells (1), FcεRIβ on mast cells and
basophils (2), and HTm4 on cells of myeloid and lymphoid origin (3). Both CD20 and FcεRIβ
have been well characterized as playing important roles in initiating signal transduction
events as components of multimeric receptor complexes. CD20 has been shown to have the
capacity to regulate B cell proliferation and differentiation as part of a large cell surface
complex with MHC-I, MHC-II, CD40, and the tetraspanins CD53, 81 and 82 (1, 4). FcεRIβ is
a key component of the tetrameric αβγ2 FcεRI complex on mast cells and basophils, and
plays a crucial role in enhancing cell surface expression of the complex and amplifying
signal transduction events mediated upon the interaction of receptor-bound IgE with
multivalent allergen (2, 5, 6). The functional role of HTm4 is unknown, but as for CD20 and
FcεRIβ, it is likely to contribute to the signalling of a multimeric receptor complex. In this
study, we report the isolation, tissue distribution and chromosomal localization of a human
gene that encodes a novel member of the CD20/FcεRIβ/HTm4 family.

MATERIALS AND METHODS 

Isolation of RNA and first strand cDNA synthesis. Total cellular RNA was prepared by
homogenising 100 mg of tissue in 1 ml of Trizol reagent (Gibco-BRL, Grand Island, NY),
upon which the aqueous fraction was recovered and RNA precipitated using isopropanol.
First strand cDNA was produced from 5 µg of total RNA by priming with an oligo dT primer
(NotdT, 5'-AACTGGAAGAATTCGCGGCCGCAGGAAT18-3') using a First Strand cDNA
synthesis system (Pharmacia Biotech, Uppsala, Sweden) according to the manufacturer's
instructions.

PCR and nucleotide sequence analysis. PCR was performed on 10 ng of first strand cDNA in
the presence of 100 ng of each oligonucleotide primer, 1.25 mM dNTPs, 50 mM KCl, 10 mM
Tris-HCl pH 8.3 and 1.5 mM MgCl2, and 1 unit of Taq DNA polymerase (Gibco-BRL,
Gaithersburg, MD) for 35 amplification cycles. 3'-RACE was performed by PCR as described
above with the oligonucleotide primer TET-1 (5'-GTCATCTCCTTTCAAATTATCAC-3',
hybridizes to nucleotides 24-46 of the TETM4 cDNA) and the NotdT primer (see above).
Nucleotide sequencing was performed by direct sequencing of amplified cDNA fragments
using an Applied Biosystems 377 sequencer.

Northern blot analysis. Northern analysis of multiple human tissue blots (Clonetech, Palo
Alto, CA) was performed by probing membranes with the full length TETM4 cDNA, labelled
by random priming (Megaprime DNA labelling system; Amersham, Buckinghamshire, UK),
using Expresshyb solution (Clonetech, Palo Alto, CA) as specified by the manufacturers.
Membranes were washed in 1x SSC for 40 min at room temperature followed by 0.1x SSC
for 40 min at 60°C and exposed to X-ray film.

Southern blot analysis. 10 µg of genomic DNA was digested with a range of restriction
enzymes, separated on a 1% agarose gel, and transferred to a Hybond-N nylon filter
(Amersham, Buckinghamshire, UK). The blot was probed with the full length TETM4 cDNA
labelled by random priming and hybridized in 50% formamide, 6x SSC, 0.5% SDS, 5x
Denhardt's solution and 100 µg/ml salmon sperm DNA at 42°C. The membrane was washed
in 1x SSC for 40 min at room temperature followed by 0.1x SSC for 40 min at 65°C and
exposed to X-ray film.

Fluorescence in situ hybridization. A 1100-bp genomic fragment of the TETM4 gene,
produced by PCR amplification with oligonucleotide primers TET-2
(5'-TTCAACTCAAAGCCCCTTGC-3', hybridizes to nucleotides 155-174 of the TETM4 cDNA)
and TET-4 (5'-CCTTGGATATGGTTTTAACAAAG-3', nucleotides 290-268), was
nick-translated with biotin-14-dATP and hybridized in situ at a final concentration of
15 ng/ml to metaphases from two normal males. The fluorescence in situ hybridization
(FISH) method was as previously described (7), with the exception that chromosomes were
stained before analysis with both propidium iodide (as counterstain) and
diaminophenylindole (DAPI) (for chromosome identification). Images of metaphase
preparations were captured by a cooled CCD camera using the chromoScan image collection
and enhancement system (Applied Imaging Int. Ltd.).

Radiation hybrid analysis. The TETM4 gene was mapped using the medium resolution
Stanford G3 panel of 83 clones. Screening of the panel was performed by PCR amplification
of a 1100-bp TETM4-specific fragment using oligonucleotide primers TET-2 and TET-4
(see above). Amplifications were performed on 10 ng of each sample DNA under the above
conditions.
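
The primer coordinates quoted in the Methods can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, not part of the original article: the sequences and cDNA positions are copied from the text above, while the function names are hypothetical and chosen only for illustration. It checks that each primer's length matches its stated coordinate range and computes the distance between the TET-2 and TET-4 sites on the cDNA.

```python
# Primer sequences and cDNA coordinates as quoted in the Methods text.
# Each entry is (sequence 5'->3', first coordinate, last coordinate).
PRIMERS = {
    "TET-1": ("GTCATCTCCTTTCAAATTATCAC", 24, 46),
    "TET-2": ("TTCAACTCAAAGCCCCTTGC", 155, 174),
    "TET-4": ("CCTTGGATATGGTTTTAACAAAG", 290, 268),  # reverse primer
}


def span(start, end):
    """Number of nucleotides covered by an inclusive coordinate range."""
    return abs(end - start) + 1


def check_primers(primers):
    """Map each primer name to True if its length equals its coordinate span."""
    return {name: len(seq) == span(s, e) for name, (seq, s, e) in primers.items()}


def cdna_amplicon_length(forward, reverse):
    """Inclusive distance on the cDNA from the forward to the reverse primer site."""
    first = min(forward[1], forward[2])
    last = max(reverse[1], reverse[2])
    return last - first + 1


print(check_primers(PRIMERS))
print(cdna_amplicon_length(PRIMERS["TET-2"], PRIMERS["TET-4"]))
```

On the cDNA the two primer sites lie far closer together than the 1100-bp genomic product, which would be consistent with intervening genomic sequence between them, although the text does not state this explicitly.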



RESULTS AND DISCUSSION 

Identification and Isolation of the TETM4 cDNA 

In order to investigate the possible existence of additional novel
members of the CD20/FcεRIβ/HTm4 family, the human dbEST (public
expressed sequence tags, GenBank database) was searched with a
consensus peptide sequence corresponding to a conserved region of the
second transmembrane (TM) region of the three known human family
members (GYPFWGAIFFSISG) (3, 8, 9). A number of ESTs were identified,
all from testis libraries (GenBank Accession Nos. AI149899, AA416972,
AA411806, AA707529, AA470059, AA436088, AA781801, AI002083,
AA435988), which contained a region homologous to the conserved search
peptide. Sequence analysis of the ESTs suggested that they were
fragments of a single gene which was related to, yet distinct from,
CD20/FcεRIβ/HTm4. The EST sequences were assembled into a single
contig of 695 bp, and examination of the compiled sequence suggested
an open reading frame that encoded a putative protein of 200 amino
acids. An oligonucleotide primer was designed to the predicted 5'
untranslated region of the cDNA (TET-1,
5'-GTCATCTCCTTTCAAATTATCAC-3', hybridizes to nucleotides 24-46 of the
TETM4 cDNA) and used in 3' rapid amplification of cDNA ends
(RACE)-PCR with the oligo-dT primer NotdT (see Materials and Methods),
on first-strand cDNA generated from human testis total RNA. A product
of 707 bp was amplified that, upon direct sequencing, was determined
to encode the complete coding region predicted from the EST contig,
confirming that the cDNA sequence was derived from a single mRNA. The
cDNA was also cloned into the vector pCR2.1 (Invitrogen, Carlsbad,
CA), and multiple clones were analysed, which revealed an identical
sequence to that determined from direct sequencing of the PCR product.
The nucleotide and deduced amino acid sequence encoded by the full
length cDNA, designated TETM4 (for testis expressed transmembrane-4,
see below), is shown in Fig. 1.
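
The quoted primer coordinates can be checked by locating each primer on the cDNA sequence. The following is a minimal sketch, not the authors' procedure: the helper names are hypothetical, and the template is the 5' untranslated region plus the first 240 coding nucleotides as transcribed from Fig. 1 (an assumption, since GenBank AF321127 was not consulted). A primer that anneals to the reverse strand is reported with descending coordinates, matching the convention used in the text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def locate_primer(template: str, primer: str):
    """Return 1-based (5' end, 3' end) positions of the primer, or None.

    Reverse-strand primers are reported with descending coordinates.
    """
    idx = template.find(primer)
    if idx >= 0:
        return (idx + 1, idx + len(primer))
    idx = template.find(reverse_complement(primer))
    if idx >= 0:
        return (idx + len(primer), idx + 1)
    return None


# Transcribed from Fig. 1: 56-nt 5' UTR, then coding nucleotides +1 to +240.
FIVE_PRIME_UTR = "GACTAGACTGAAGTACCAACTAAGTCATCTCCTTTCAAATTATCACCGACACCATC"
CODING_START = (
    "ATGGATTCAAGCACCGCACACAGTCCGGTGTTTCTGGTATTTCCTCCAGAAATCACTGCT"
    "TCAGAATATGAGTCCACAGAACTTTCAGCCACGACCTTTTCAACTCAAAGCCCCTTGCAA"
    "AAATTATTTGCTAGAAAAATGAAAATCTTAGGGACTATCCAGATCCTGTTTGGAATTATG"
    "ACCTTTTCTTTTGGAGTTATCTTCCTTTTCACTTTGTTAAAACCATATCCAAGGTTTCCC"
)
CDNA_5PRIME = FIVE_PRIME_UTR + CODING_START

TET_1 = "GTCATCTCCTTTCAAATTATCAC"
TET_4 = "CCTTGGATATGGTTTTAACAAAG"

print(locate_primer(CDNA_5PRIME, TET_1))  # (24, 46)
print(locate_primer(CDNA_5PRIME, TET_4))  # (290, 268)
```

Both results agree with the positions given in the text, which are numbered from the first nucleotide of the cDNA rather than from the initiation codon used for numbering in Fig. 1.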

The complete TETM4 cDNA is 695 bp long and contains a canonical
polyadenylation signal sequence at nucleotides 669-673 (Fig. 1). The
cDNA encodes a deduced protein of 200 amino acids with a predicted
molecular weight of 22.2 kDa. Hydropathy analysis indicates the
presence of four hydrophobic regions that represent four putative
transmembrane domains. Using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html), which predicts
membrane spanning regions and their orientation, TETM4 is predicted to
have four strong transmembrane helices which are likely to adopt a
membrane topology with both the N- and C-termini intracellular. On the
basis of this prediction, the TETM4 protein can be divided into the
following domains: four transmembrane domains (TM-1, TM-2, TM-3 and
TM-4) of 22, 21, 20 and 22 amino acids respectively, N- and C-terminal
cytoplasmic domains of 48 and 18 amino acids, respectively, two
extracellular loops of 14 and 22 amino acids and a short intracellular
loop of 13 amino acids (Fig. 1). Significantly, both CD20 and FcεRIβ
have been shown experimentally to have a topology on the cell surface
as predicted here for TETM4 (1, 2). However, it should be noted that
it remains possible that TETM4 may adopt an alternate topology with
both the N- and C-termini extracellular. Furthermore, it also needs to
be considered that TETM4 may not be expressed on the cell surface but
instead is localized on a subcellular membrane(s).

 -56 GACTAGACTGAAGTACCAACTAAGTCATCTCCTTTCAAATTATCACCGACACCATC

   1 ATGGATTCAAGCACCGCACACAGTCCGGTGTTTCTGGTATTTCCTCCAGAAATCACTGCT
   1 MDSSTAHSPVFLVFPPEITA

  61 TCAGAATATGAGTCCACAGAACTTTCAGCCACGACCTTTTCAACTCAAAGCCCCTTGCAA
  21 SEYESTELSATTFSTQSPLQ

 121 AAATTATTTGCTAGAAAAATGAAAATCTTAGGGACTATCCAGATCCTGTTTGGAATTATG
  41 KLFARKMKILGTIQILFGIM

 181 ACCTTTTCTTTTGGAGTTATCTTCCTTTTCACTTTGTTAAAACCATATCCAAGGTTTCCC
  61 TFSFGVIFLFTLLKPYPRFP

 241 TTTATATTTCTTTCAGGATATCCATTCTGGGGCTCTGTTTTGTTCATTAATTCTGGAGCC
  81 FIFLSGYPFWGSVLFINSGA

 301 TTCCTAATTGCAGTGAAAAGAAAAACCACAGAAACTCTGATAATATTGAGCCGAATAATG
 101 FLIAVKRKTTETLIILSRIM

 361 AATTTTCTTAGTGCCCTGGGAGCAATAGCTGGAATCATTCTCCTCACATTTGGTTTCATC
 121 NFLSALGAIAGIILLTFGFI

 421 CTAGATCAAAACTACATTTGTGGTTATTCTCACCAAAATAGTCAGTGTAAGGCTGTTACT
 141 LDQNYICGYSHQNSQCKAVT

 481 GTCCTGTTCTTGGGAATTTTGATTACATTGATGACTTTCAGCATTATTGAATTATTCATT
 161 VLFLGILITLMTFSIIELFI

 541 TCTCTGCCTTTCTCAATTTTGGGGTGCCACTCAGAGGATTGTGATTGTGAACAATGTTGT
 181 SLPFSILGCHSEDCDCEQCC

 601 TGACTAGCACTGTGAGAATAAAGATGTGTTAAAATAAAAAA



FIG. 1. Nucleotide and deduced amino acid sequence of human TETM4. The nucleotide sequence is numbered with the first nucleotide 
of the translational initiation codon as +1. The amino acid sequence is presented below the nucleotide sequence in single letter code, with 
the four putative transmembrane regions underlined. The predicted initiation codon and stop codon are in bold type and the polyadenylation 
signal sequence is underlined. GenBank Accession No. AF321127. 
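
The deduced amino acid rows of Fig. 1 can be re-derived from the nucleotide rows with the standard genetic code. The sketch below is an illustration under stated assumptions: the codon table is built from the standard code in TCAG order, the function name is hypothetical, and the two nucleotide rows are transcribed from the figure rather than taken from GenBank AF321127.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides +1 to +60 (first coding row of Fig. 1).
first_row = "ATGGATTCAAGCACCGCACACAGTCCGGTGTTTCTGGTATTTCCTCCAGAAATCACTGCT"
# Nucleotides +541 to +603 (last coding row plus the TGA stop codon).
last_row = "TCTCTGCCTTTCTCAATTTTGGGGTGCCACTCAGAGGATTGTGATTGTGAACAATGTTGTTGA"

print(translate(first_row))  # MDSSTAHSPVFLVFPPEITA
print(translate(last_row))   # SLPFSILGCHSEDCDCEQCC
```

Applying the same function to all ten coding rows gives 200 residues before the TGA stop, in agreement with the protein length stated in the text.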



Analysis of the TETM4 Amino Acid Sequence 

The alignment of the predicted amino acid sequence of TETM4 to that of
human CD20 (8), FcεRIβ (9), and HTm4 (3), indicates an overall
identity of 31.0% (55.4% similarity), 23.2% (47.0%) and 26.4% (51.9%),
respectively (Fig. 2). The identity of TETM4 to CD20/FcεRIβ/HTm4 is
highest in the transmembrane regions, with the N- and C-termini and
intra-transmembrane loop regions showing little homology. TETM4
contains a number of charged/polar residues in its TM regions,
including a glutamine residue (Q54) in the first TM domain, an
asparagine in each of the second (N97) and third (N121) TM domains,
and a glutamic acid (E177) in the fourth TM domain. All of these
residues, with the exception of N97, are also conserved in
CD20/FcεRIβ/HTm4 (Fig. 2). Interestingly, charged/polar residues are
also common in the transmembrane regions of other multi-membrane
spanning proteins such as the tetraspanins, which like CD20 and
FcεRIβ, associate with other membrane molecules (10). Other
interesting structural features of the TETM4 protein include two
cysteine residues in its second extracellular loop region (C147 and
C156), which are also found in CD20/FcεRIβ/HTm4, and may be involved
in forming an intra- or inter-chain disulphide bond(s). The C-terminal
cytoplasmic tail of TETM4 contains a cluster of five cysteine residues
in a 12-residue stretch (C189, 194, 196, 199 and 200), which may also
be involved in disulphide bond formation.
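
The residues named above can be placed on the predicted topology by converting the quoted domain lengths into residue ranges. This is a sketch under assumptions: the N-to-C order of the domains, including which extracellular loop is the 14-residue one, is inferred from the four-transmembrane topology and is not spelled out in the text; the protein string is transcribed from Fig. 1; the helper names are hypothetical.

```python
# Domain lengths as quoted in the text; the order is an assumption.
DOMAINS = [
    ("N-terminal cytoplasmic", 48),
    ("TM-1", 22),
    ("extracellular loop 1", 14),
    ("TM-2", 21),
    ("intracellular loop", 13),
    ("TM-3", 20),
    ("extracellular loop 2", 22),
    ("TM-4", 22),
    ("C-terminal cytoplasmic", 18),
]

# Deduced TETM4 protein, transcribed from Fig. 1 (ten rows of 20 residues).
TETM4 = (
    "MDSSTAHSPVFLVFPPEITA" "SEYESTELSATTFSTQSPLQ"
    "KLFARKMKILGTIQILFGIM" "TFSFGVIFLFTLLKPYPRFP"
    "FIFLSGYPFWGSVLFINSGA" "FLIAVKRKTTETLIILSRIM"
    "NFLSALGAIAGIILLTFGFI" "LDQNYICGYSHQNSQCKAVT"
    "VLFLGILITLMTFSIIELFI" "SLPFSILGCHSEDCDCEQCC"
)


def domain_boundaries(domains):
    """Map each domain name to its 1-based, inclusive (first, last) residues."""
    boundaries = {}
    start = 1
    for name, length in domains:
        boundaries[name] = (start, start + length - 1)
        start += length
    return boundaries


def domain_of(position, boundaries):
    """Return the name of the domain containing a 1-based residue position."""
    for name, (first, last) in boundaries.items():
        if first <= position <= last:
            return name
    return None


BOUNDS = domain_boundaries(DOMAINS)
total_length = sum(length for _, length in DOMAINS)
print(total_length)  # 200

for label in ["Q54", "N97", "N121", "E177", "C147", "C156"]:
    residue, position = label[0], int(label[1:])
    matches = TETM4[position - 1] == residue
    print(label, matches, domain_of(position, BOUNDS))

# Cysteines in the final 12 residues (positions 189-200).
tail_cysteines = [p for p in range(189, 201) if TETM4[p - 1] == "C"]
print(tail_cysteines)  # [189, 194, 196, 199, 200]
```

Under this ordering the domain lengths sum to the full 200 residues, the four charged or polar residues fall in TM-1 to TM-4 respectively, and C147 and C156 both fall in the second extracellular loop (residues 139-160).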

FIG. 2. Amino acid alignment of TETM4 with CD20, FcεRIβ and HTm4. The amino acid sequences are presented in single letter code.
The alignment was performed using CLUSTALW (GCG package) and adjusted manually. Gaps (-) have been introduced to maximise
alignment of the sequences. Identical or similar residues between at least two sequences are shaded in black or grey, respectively. Similar
residues are defined as: D, E (acidic); A, G, I, L, V (aliphatic); N, Q (side-chain containing amide group); F, W, Y (aromatic); R, H, K (basic);
S, T (side-chain containing hydroxyl group). The positions of the four putative transmembrane regions for TETM4 and FcεRIβ are overlined
and underlined, respectively. The predicted TM regions for HTm4 and CD20 are very similar to that shown for FcεRIβ, with the exception
that CD20 contains a continuous hydrophobic stretch between TM-1 and TM-2. The GenBank Accession Nos. are: TETM4, AF321127; CD20,
AAA35581; FcεRIβ, AAA60269; HTm4, AAA62319.

Identification of a TETM4 Splice Variant

The PCR amplification of the TETM4 cDNA with oligonucleotide primers
TET-1 and NotdT (see Materials and Methods) from human testis also led
to the isolation of a putative splice variant. Nucleotide sequence
analysis indicated that this cDNA is 542 bp long and is identical to
the TETM4 cDNA; however, it is missing the coding region for the third
transmembrane and second extracellular domains (nucleotides 394 to 547
encoding amino acids L113 to F163). This splice variant is also
represented in the EST database by two clones (GenBank Accession Nos.
AA411806 and AA781801). The deduced polypeptide encoded by this cDNA
would contain only three transmembrane regions, and would therefore be
predicted to have a membrane topology with the C-terminal domain
extracellular, as opposed to intracellular for the four-transmembrane
form. A putative TETM4 protein with this different topology would be
likely to have an altered function. The lack of the fourth
transmembrane region may influence possible association(s) with other
membrane molecules, and the shifting of the C-terminal domain from
intracellular to extracellular may change any potential signalling
capacity mediated through interactions with intracellular signalling
proteins. Clearly, it would be of interest to determine if this
variant encodes a functional protein.
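
Whether the variant keeps the downstream reading frame can be checked from the figures quoted above. The sketch below is illustrative only and assumes the deletion corresponds exactly to residues L113 to F163; the function name is hypothetical. The residue range is used rather than the quoted nucleotide interval, because 394 to 547 counted inclusively spans 154 positions, one more than the residue range implies.

```python
def deleted_nucleotides(first_residue: int, last_residue: int) -> int:
    """Number of coding nucleotides removed with an inclusive residue range."""
    return (last_residue - first_residue + 1) * 3


FULL_LENGTH_BP = 695   # complete TETM4 cDNA
VARIANT_BP = 542       # putative splice variant

deleted = deleted_nucleotides(113, 163)

print(deleted)                      # 153
print(FULL_LENGTH_BP - VARIANT_BP)  # 153
print(deleted % 3 == 0)             # True: the reading frame is preserved
```

The 51 missing codons account for the whole length difference between the two cDNAs, so the sequence downstream of the deletion would be read in the original frame.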



Tissue Distribution of TETM4 mRNA 

The tissue distribution of TETM4 mRNA was investigated by Northern
blot analysis of a range of human tissues. A strong band centered
around 0.7 kb was detected only in testis and not in spleen, thymus,
prostate, ovary, small intestine, colon, peripheral blood leukocyte,
heart, brain, placenta, liver, lung, skeletal muscle, kidney or
pancreas (Fig. 3). Prolonged exposure of the Northern blot failed to
reveal any significant signal in any tissue other than testis.
Reverse-transcriptase (RT)-PCR analysis of TETM4 mRNA was also
performed on first strand cDNA made from mRNA isolated from the above
range of human tissues. Amplification with oligonucleotides TET-2 (see
Materials and Methods) and TET-3 (5'-CAGTAACAGCCTTAGACTGAC-3',
hybridizes to nucleotides 537-516 of the TETM4 cDNA) produced the
expected product of 383 bp only in testis, but not any other tissue
(data not shown). These data suggest that TETM4 shows an extremely
specific tissue distribution, being found only in the testis. The
identification of TETM4 ESTs derived only from human testis libraries
also supports the proposed testis-specific expression. Thus, in
contrast to CD20/FcεRIβ/HTm4, which are all hematopoietic specific,
TETM4 is the first member of this family that is expressed in a
non-hematopoietic tissue.

FIG. 3. Northern blot analysis of TETM4 mRNA expression.
Multiple tissue Northern blot filters (Clontech, Palo Alto, CA) were
hybridized with 32P-labelled full-length TETM4 cDNA in ExpressHyb
solution (Clontech, Palo Alto, CA) as specified by the manufacturer.
The filters were rehybridized with a control 32P β-actin cDNA and show
approximately equal amounts of mRNA loaded per lane; heart and
skeletal muscle have two β-actin transcripts. The positions of
molecular weight markers (in kilobases) are indicated. Exposure times
were 12 h for both TETM4 and β-actin.

Southern Blot Analysis and Chromosomal
Localization of the TETM4 Gene

Southern blot analysis was performed on restricted human genomic DNA
using the full length TETM4 cDNA as a probe. A simple banding pattern
was produced that is consistent with the TETM4 gene being a single
copy gene (Fig. 4). To determine the chromosome localization of the
TETM4 gene, fluorescence in situ hybridization (FISH) was performed on
metaphase chromosomes of two normal males using a 1100-bp
TETM4-specific genomic fragment as a probe. Twenty metaphases from the
first male were examined for a fluorescent signal, which was present
in all 20 metaphases in the region 11q12-11q13, with 57% of the signal
located in the central portion of band 11q12 (Fig. 5). Similar results
were obtained from hybridizations of the probe to metaphases from the
second normal male. Radiation hybrid mapping of the TETM4 gene was
also performed using the medium resolution Stanford G3 panel of 83
clones. Screening of the panel by PCR amplification of a 1100-bp
TETM4-specific fragment using oligonucleotide primers TET-2 and TET-4
(see Materials and Methods) indicated that TETM4 is most closely
associated with the Stanford Human Genome Centre marker SHGC-20674,
with a LOD score of 13.23 (data not shown). SHGC-20674 is not ordered
on a Stanford map; however, Stanford has linked it to marker
SHGC-35409 which is ordered on the Stanford Radiation Hybrid Map. The
markers most closely associated with TETM4 are flanked by markers
D11S1335 and D11S4363. Searches of the Cytogenetic Yac Bank
(http://sgiweb.ncbi.nlm.nih.gov/Zjing/yac.html) placed the flanking
markers D11S1335 and D11S4363 on Yac WC11.5 which spans the 11q12
cytogenetic band. The genes for CD20, FcεRIβ and HTm4 have also been
mapped to the same region of chromosome 11 (11q12-13) (3, 11, 12).
These data suggest that the TETM4, CD20, FcεRIβ, and HTm4 genes have
evolved by duplication and divergence of the same ancestral gene to
form a family of four-transmembrane genes.

FIG. 4. Southern blot analysis of the TETM4 gene. 10 µg of
human genomic DNA was isolated from two male individuals (A) and
(B), restricted with a range of enzymes, and Southern analysis
performed by hybridizing with random primed 32P-labelled TETM4
cDNA in 50% formamide, 6x SSC, 0.5% SDS and [illegible] Denhardt's
solution. The blot was washed under high stringency conditions and
exposed to X-ray film. The positions of molecular weight markers (in
kilobases) are indicated.

FIG. 5. Chromosomal localization of the TETM4 gene by FISH.
Partial metaphases are displayed showing FISH with a TETM4
probe. Normal male chromosomes have been counterstained with
DAPI. Hybridization sites on chromosome 11 are indicated by arrows.
FISH signals and the DAPI banding pattern have been merged.

The tetraspanins comprise a distinct family of four-transmembrane
molecules that are expressed on both hematopoietic and
non-hematopoietic cells. In contrast to the CD20/FcεRIβ/HTm4/TETM4
family, the tetraspanins appear to form a far more extensive family
and are found in species ranging from schistosomes to humans. At least
20 members have been described in the human, including CD9, CD37,
CD53, CD81, and CD82 (10). It is possible that like the tetraspanins,
the CD20/FcεRIβ/HTm4/TETM4 family may be much larger. Indeed, we have
recently identified a number of additional family members that we are
currently characterising (M. Hulett, manuscript in preparation).

The testis-specific expression of TETM4 raises some intriguing
questions as to its function. As described above, CD20 and FcεRIβ are
expressed specifically on hematopoietic cells where they form
components of multimeric cell surface receptor complexes, and play
important roles in signal transduction (1, 5, 6). It is therefore
tempting to speculate that TETM4 may also associate with receptor
complexes on the surface of specific cells in the testis and
participate in signalling events. Clearly, to address these
possibilities and to delineate the function of TETM4, further
fundamental issues need to be addressed such as determining the
cellular and subcellular distribution of TETM4 in the testis. These
studies are currently in progress.

CD20, high-affinity IgE receptor β chain (FcεRIβ), and HTm4 are
structurally related cell-surface proteins expressed by hematopoietic
cells. In the current study 16 novel human and mouse genes that encode
new members of this nascent protein family were identified. All family
members had at least four potential membrane-spanning domains, with N-
and C-terminal cytoplasmic domains. This family was therefore named
the membrane-spanning 4A gene family, with at least 12 subgroups
(MS4A1 through MS4A12) currently representing at least 21 distinct
human and mouse proteins. Each family member had unique patterns of
expression among hematopoietic cells and nonlymphoid tissues. Four of
the 6 human MS4A genes identified in this study mapped to chromosome
11q12-q13.1 along with CD20, FcεRIβ, and HTm4. Thus, like CD20 and
FcεRIβ, the other MS4A family members are likely to be components of
oligomeric cell surface complexes that serve diverse signal
transduction functions. © 2001 Academic Press



INTRODUCTION 

CD20, high-affinity IgE receptor β chain (FcεRIβ), and HTm4 are three
cell surface proteins expressed by hematopoietic cells that represent
members of a nascent gene family (Adra et al., 1994; Kinet, 1999;
Tedder and Engel, 1994). The deduced amino acid sequence of human and
mouse CD20 first demonstrated a cell-surface protein containing four
membrane-spanning regions, N- and C-terminal cytoplasmic domains, and
an ~50-amino-acid loop that serves as the extracellular domain
(Einfeld et al., 1988; Stamenkovic and Seed, 1988; Tedder et al.,
1988a,b). Human CD20 shares 20% amino acid sequence identity with
FcεRIβ and HTm4 (Adra et al., 1994; Kuster et al., 1992). Moreover,
these three proteins have a similar overall structure in human, mouse,
and rat with significant sequence identity within the first three
membrane-spanning domains (Kinet et al., 1988; Ra et al., 1989; Tedder
et al., 1988a). In addition, all three genes are located in the same
region of human chromosome 11q12-q13.1 (Adra et al., 1994; Hupp et
al., 1989; Tedder et al., 1989a) and mouse chromosome 19 (Hupp et al.,
1989; Tedder et al., 1988a). These three genes are therefore likely to
have evolved from a common precursor.

Sequence data from this article have been deposited with the
GenBank Data Library under Accession Nos. AF237905-237918,
AF280401, and AF286866.

Despite structural and sequence conservation between CD20, FcεRIβ, and
HTm4, transcription of each gene is differentially regulated. CD20 is
expressed only by B lymphocytes (Stashenko et al., 1980; Tedder et
al., 1988a). FcεRIβ is expressed by mast cells and basophils (Kinet,
1999). HTm4 is expressed by diverse lymphoid- and myeloid-origin
hematopoietic cells (Adra et al., 1994). Although the function of HTm4
remains unexplored, CD20 is functionally important for regulating cell
cycle progression and signal transduction in B lymphocytes (Tedder and
Engel, 1994). Moreover, CD20 forms a homo- or perhaps heterotetrameric
complex that regulates Ca2+ conductance by either forming or serving
as a functional component of a Ca2+-permeable cation channel (Bubien
et al., 1993; Kanzaki et al., 1995, 1997a,b). FcεRIβ is part of a
tetrameric receptor complex consisting of α, β, and two γ chains
(Blank et al., 1989). FcεRI mediates interactions with IgE-bound
antigens that lead to cellular responses such as the degranulation of
mast cells. Specifically, the FcεRIβ subunit functions as an amplifier
of FcεRIγ-mediated activation signals (Dombrowicz et al., 1998; Lin et
al., 1996). Because of their unique structure and sequence homologies,
CD20, FcεRIβ, and HTm4 are likely to share overlapping functional
properties. CD20 and FcεRIβ are also important clinically, as
antibodies against CD20 are effective in treating non-Hodgkin's
lymphoma (McLaughlin et al., 1998; Onrust et al., 1999; Weiner, 1999).
Genetic variations at chromosome 11q12-q13 may also play a role in the
pathogenesis of allergic diseases with FcεRIβ representing one
candidate gene (Adra et al., 1999; Kinet, 1999).

Since CD20, FcεRIβ, and HTm4 are likely to have evolved by duplication
of an ancestral gene, other related proteins that form additional
receptor complexes might exist. To address this, we have identified 16
novel human and mouse proteins that span the membrane at least four
times and share high levels of amino acid sequence identity with CD20,
FcεRIβ, and HTm4. This finding reveals a new gene family that we have
designated the MS4A family (membrane-spanning 4-domain family,
subfamily A). Currently this family contains at least 12 subgroups
(MS4A1 through MS4A12, Table 1) that encode at least 21
heretofore-unidentified human and mouse proteins expressed by
hematopoietic cells and diverse cell types in nonhematopoietic
tissues.

MATERIALS AND METHODS 



Database searches and cDNA isolation. Three hundred thirty-seven
nucleotide sequences obtained from the translated GenBank database of
expressed sequence tags (ESTs) were assembled into 62 subgroups of
contiguous linear segments based on their overlapping sequences and
potential for encoding proteins homologous with CD20. Based on these
subgroups, EST cDNAs (Fig. 1 and data not shown) were obtained from
the American Type Culture Collection (ATCC, Bethesda, MD) and
sequenced. Based on the complete sequences of 21 near full-length EST
cDNAs, 11 novel genes that unified multiple EST subgroups were defined
in human and mouse. Near full-length EST clones representing these
genes are shown in Fig. 1. These 11 genes and 5 additional genes were
also identified by PCR amplification of transcripts using
subgroup-specific primers, or primers based on EST sequences. The
specific details of how cDNAs representing the 5 genes that were not
identified by EST cDNA clones were isolated are indicated below. In
all cases, ESTs and cDNAs encoding the predicted coding regions of
each putative unique gene were sequenced in both directions and at
least 2 independent ESTs and/or cDNAs representing near full-length
gene products were sequenced. Thereby, there was independent
confirmation of accuracy for all of the sequences reported (Fig. 1).

Based on EST subgroup sequences, cDNAs encoding mouse MS4a4B and
MS4a4C were isolated by PCR amplification of C57BL/6 mouse spleen cDNA
using both Taq and Pfu DNA polymerases. Primer sets for MS4a4B (sense
5'-CAC GAG GCA CAC AAG CAA AGC-3', antisense 5'-AAG TGC TTG ACT TAG
ATA CTT ACA G-3') amplified an 879-bp fragment. Primer sets for MS4a4C
(sense 5'-TGG GTG AGA ACA CAC AAT CAA AAC-3', antisense 5'-CAC ATA CAC
ACA AGA GAA TTA GAC-3') amplified a 794-bp fragment. EST sequences for
MS4a4D encoded only the 3' end of the predicted protein. Since MS4a4D
sequences were closely related to MS4a4B and MS4a4C sequences, a sense
5' primer (5'-CCA 1 TCT GTA CTG TTT CTG CTG-3') based on consensus
MS4a4B and MS4a4C sequences and a MS4a4D-specific antisense primer
(5'-GCC AAA TGC ATA CAC ATG TGC AC-3') were used to amplify a 773-bp
fragment from cDNA of C57BL/6 mouse lung.

MS4a6C was initially identified based on one unique EST sequence
(AA028258) encoding a mouse protein homologous with the C-terminal end
of MS4a6B. MS4a6C cDNAs were isolated by PCR amplification of C57BL/6
mouse bone marrow cDNA using Taq polymerase. A primer based on
identical sequences at the 5' end of the MS4a6B and MS4a6D cDNAs
(sense 5'-CTG GAA GTG ACT GGG TGA CAA GGC-3') was used in combination
with an antisense primer specific for the unique EST sequence (5'-CAA
TCG TTC CTC ATA TGC ACA G-3') to amplify a 787-bp fragment. Sequences
from multiple independent PCR-amplified cDNAs were identical.
Subsequently, the PCR-generated 5' end of the near full-length MS4a6C
cDNA was found to be identical to an orphan EST subgroup sequence that
had not been linked with defined 3' sequences. Thereby, the EST
subgroup sequences verified that the PCR-amplified 5' end of the
MS4a6C cDNAs was appropriate. In addition, the overall MS4a6C sequence
was similar to the sequence of MS4a6B cDNAs without interruption.
Thus, the MS4a6C cDNA united sequences identical to those found in two
nonoverlapping CD20-homologous EST subgroups.

cDNAs encoding a 473-bp fragment of mouse MS4a3 were amplified from
cDNA of C57BL/6 bone marrow as described above. Primer sets (sense
5'-AGA CTC TGG TGG TCA TTA CTG TCT C-3', antisense 5'-GAA TGC CAA ATG
CAC AGA AAG G-3') were obtained based on a single thymic cDNA EST
sequence (GenBank AA940479) when the corresponding cDNA was not
available.

All PCR-amplified cDNAs were subcloned and sequenced entirely in both
directions. Complete sequencing of at least two distinct PCR-generated
cDNAs from both Taq and Pfu enzymes was performed in most cases.
Differences between cDNA sequences were noted only when multiple cDNA
clones generated by both Taq and Pfu polymerases revealed identical
differences. In some cases, cDNAs or EST sequences contained potential
intron/exon splice sites that delimited structural domains and aligned
with the known intron/exon splice sites of CD20 (Tedder et al.,
1989b). In these cases, potential introns were flanked by consensus
splice donor and/or splice acceptor sequences (Aebi and Weissmann,
1987) or were likely to represent splice variants from which exons
were deleted.

RNA isolation and reverse transcription-PCR. Reverse
transcription-PCR amplification was as described previously (Zhou and
Tedder, 1995) with minor modifications. Total RNA was extracted from
1-2 × 10^7 cells or frozen tissue using an RNeasy Mini Kit (Qiagen,
Inc., Chatsworth, CA) according to the manufacturer's instructions.
RNA concentrations were determined by UV absorbance. Ten micrograms of
total RNA was reverse transcribed and PCR amplified with primers (500
nM) for 40 cycles (94°C for 1 min, 55°C for 1.5 min, 72°C for 1.5 min,
followed by extension at 72°C for 5 min). Following amplification, the
PCR products were separated on 1% agarose-ethidium bromide gels and
photographed. G3PDH, a housekeeping gene, was also amplified to
control for sample-to-sample variation. RNA amplified without reverse
transcription was used as a negative control and was negative in all
cases (data not shown).



RESULTS 

Identification of CD20 gene family members. To identify new CD20 gene
family members, the human and mouse CD20 amino acid sequences (Tedder
et al., 1988a,b) were used to search the translated GenBank databases,
including expressed sequence tags, using the BLAST program (Altschul
et al., 1997). Among 337 homologous sequences identified, at least 17
novel genes expressed by mouse, human, and pig had predicted amino
acid sequences homologous to CD20. Complete coding regions were
predicted using overlapping nucleotide sequences obtained from
sequenced ESTs and cDNAs that corresponded to unique, near full-length
transcripts in humans and mice (Fig. 1). All nucleotide sequences were
verified by sequencing multiple near full-length cDNAs isolated in our
laboratory and 40 cDNAs obtained from the ATCC. In addition, a pig
cDNA and its human counterpart homologous to CD20 were identified as
GenBank submissions AJ236932.1 and AK000224, respectively. In total,
unique cDNA clones that encoded at least 16 distinct full-length
CD20-like proteins were identified (Fig. 1).

FIG. 1. cDNAs encoded by 15 new human or mouse MS4A gene
products. Consensus sequences from cDNAs and overlapping ESTs
are indicated by their GenBank accession numbers. Representative
full-length cDNAs for each gene product are shown, except for
MS4a3, which was not full length. 5' and 3' untranslated sequences
are shown as horizontal lines with relative nucleotide lengths shown.
Coding regions are shown as boxes with translation initiation and
termination codons and their relative nucleotide locations are
shown. Poly(A) attachment signal sequences (AATAAA) are indicated
when known. Deduced hydrophobic regions are shown as filled
boxes with the predicted membrane-spanning domains shown as
TM1-TM4. Additional hydrophobic regions in MS4A4 proteins are
shown as shaded boxes. Sites of potential nucleotide polymorphisms
in MS4A6A are indicated by two X's.

In collaboration with the Human Gene Nomenclature Committee
(www.gene.ucl.ac.uk/nomenclature/), this gene family was designated
the MS4A family (membrane-spanning 4-domain family, subfamily A). The
MS4 designation is to accommodate the future identification of genes
encoding proteins with a similar structure, yet with unresolved
functions. Subfamily A will designate the CD20 family. Using this
nomenclature, the CD20 gene was designated MS4A1, FcεRIβ MS4A2, and
HTm4 MS4A3. Among the 16 novel genes identified, 6 human genes were
named MS4A4A, MS4A5, MS4A6A, MS4A7, MS4A8B, and MS4A12. The remaining
genes were of mouse or pig origin and were therefore labeled
MS4a3-MS4a12 based on the nomenclature of homologous genes
corresponding to human counterparts (Table 1). Distinct mouse gene
products that encoded proteins with highly homologous sequences were
designated MS4a4B, MS4a4C, and MS4a4D and MS4a6B, MS4a6C, and MS4a6D
to signify close homologies.

Structures of MS4A family members. Complete coding region sequences
were verified for each deduced protein, except for the MS4a3 protein,
which was not full length (Fig. 1). Proposed ATG translation
initiation codons were based on the translation initiation consensus
sequence, ANNATG (Kozak, 1986), and the existence of in-frame upstream
translation stop codons in most cases. Poly(A) attachment signal
sequences were identified in the 3' untranslated regions of each gene
product except MS4A6A and MS4a6C. Two poly(A) signal sequences were
found in MS4a4D, MS4A5, and MS4a10 transcripts, while four were
observed in MS4A4A transcripts. The MS4A genes encoded proteins of
22-29 kDa (Fig. 2, Table 1). Whether the first or second ATG codon in
mouse MS4a8B was used for translation initiation was unknown although
the second ATG was identical with the start codon of human MS4A8B
(Fig. 2). There were no amino-terminal signal sequences, although all
MS4A proteins contained hydrophobic regions of sufficient length to
pass through the membrane at least four times (Figs. 1 and

TABLE 1
MS4A Family Members

Human              Mouse                    Human/mouse
Name      kDa      Name            kDa      homology
-         -        MS4a3           -        63% (partial)
MS4A4A    23       -               -        -
-         -        Ms4a4B          24       41%
-         -        Ms4a4C          24       44%
-         -        Ms4a4D          24       40%
MS4A5     22       -               -        -
MS4A6A    27       -               -        -
-         -        Ms4a6B          27       52%
-         -        Ms4a6C          24       51%
-         -        Ms4a6D          26       53%
MS4A7     26       MS4a7           26       53%
MS4A8B    26       MS4a8B          29       63%
-         -        MS4a10          29       -
MS4A12    26       MS4a12 (pig)    26       60%

Note. Predicted molecular weights for the new MS4A family members and
the percentage amino acid sequence identity between deduced MS4A and
MS4a proteins.

A1   MTTPRNSVNGTFPAEP^GPIAMQSGPKPLFlU^SSLVGPTQSFFMRfiSia^
a1   ^ MSGPP PAEPTKG PLAMQFAPKVNLKRT S SLVG PTQSFFMRE S KA tX5
A2   MDTESNRRANLALP QEPS | SVPAFEVLEISPQEVSSGRLI,KSASSPPLHTWLT^^^.EQEF^>G
a2   MDTENRSRADLALPNPQESS SAPDIELLEASP AKAAPPKC/IWTFfciCK^EFtG
A3   MASHEVDNAEUSSASAHGTPGSETTOPEEUn'SVYHPINGSPDYQKAKLQVLG
a3   MKPEETGGSVYQPLDESRHVQRGVLQALG
A4A  MTTMQGMEQAMPG AGPGVPQLGNMA VI H S H LWKGLQ EKFIiKGEPIWLG
a4B  MQGCjEQTTMAWPGVAVPSKNSVfrfTSQMWNE^^
a4C  MQC^EQTTMAWPGGAPPSEKSVMKSQMVWENKE^
a4D  MQGlAQTTMAWPGGAPPSENSVIKSQMWKKNKEKFI^GEP}OpG
A5   MDSSTAHSPVPLVFPPEITASEYESTELSATTFSTQSPI^^FARKM^IIi0
A6A  MTSQPVPNETIIVLPSNVIMFSQAEKPEPTNQGQDSLKJ01LHAEIKVIG
a6B  MIPQWTSETVAMISPNGMSLPQTDKPQPFHQWQDSLKraup^IK^^
a6C  MI PQWTNET I TT IS PNG INFPQKDE S Q PTQQRQDS LI^HtiCABl l^^I V
a6D  MI PQV\TTSElVWISPNGISFPQTDKPQPSHQSODSLKkHI^p^lk^
A7   MLLQSQTMGVSHSFTPKGITIPQREKPGH^fYQNEDYLQNGiPTiTT^CG
a7   MRLQLGTKNI GWDCF PKDI 1 1 HKREKTG HTY EKEDDLLIGVP S EATLLG
A8B  ^SMTSAVPVAKSVLWAPHNGYPVTPGIMSHVPLYPNSQPQVHLVPGNPPSLVSlWNGQPVQKAIiKiGKT^
a8B  KEPEQERLTWQPGTVSMNTVTS PGPMANS VYWAPPN S YPWPGTVPQMPI YPSNQPQVHVI SG HL PG LVP AMTEP PAQRVIJC KGQVliG
a10  MAGQAPT AVPG SVTGEVS RWQNLG PAQPAQKVAQPQNLVPDGHLEKALEGSDLLQKtG
A12  MMSSKPTSHAEVNETIPNPYPPGSFmFGFQQPLXSSI^ENQAC^^
a12  MMSSKPTTYPGVYGTTPDLYPPSNFMVPGSQQPPGFINPRIOVQSSQ APFIVSPGIPI^SQQVQGNIQMVNPGTGKAATNF^3|AkTLG


TM1 

AVQIKNGBFHIALGGLLMI PAG I 76 

ftVj^n4NGIi7HITLGGLl>II PTGV 6 9 

(VltliLTAMICIiCFGTWCSVLDI 35 

AT^ZliVGIilCSC FGTIVC SVLYV 7 7 

Af ^XtKAAMIliALGVFLG SLQYP 7 5 

AIQ£UtfGI LILALG IFLVCLQHV 5 2 

WQILTALKSLSMGITMVCMASK 71 

VLQVMIAIINLSLGIIILTTLF 67 

WQVMIAIilNL^FGIIILANLS 67 

AIQVM I AF INFS LGI I I ILKRV 67 

TIQUjFGIMTFSFGVIFLFTLLK 74 

TIOI^CGMMVLSLGIILA SASFS 72 

klQIMCAVMVLSLGIILASVPSN 72 

A^IQlMCAVTVLAtGIlLASVPPV 72 

AIQIKCAVMVLSEp.ILI AS VPSN 7 2 

TVQII^CCfiLI SSliGAlLVFAPY P 72 

TIQLLCAiilLASFGGILVSASY 71 

XfQilliGL.AHIGLGSIMATV^VG 96 

A?QItIGtVHIGLGSIMITNLFS 112 

GFHiAIAFAHliAFGGYLISTVKN 81 

VlQlMVGtiMH IG FGIVtiC LISFS 115 

Ai^IIilGSKH IGF£I ILGLMGRT 1 1 2 







Al 


YAPICVTV 


al 


FAPICIiSV 


A2 


SHIEGDIF5SFKA 


a2 


SDFDEEVLL|YKL 


A3 


YHFQKHFFFFTFYT 


a3 


SHHFRHFFFFTFYT 


A4A 


TYGSNPllSVYI 


a4B 


SELPTSVM^ 


a4C 


SEPLISWIJ 


a4D 


SERFMSVUi 


A5 


PYPRFPFIFLS 


A6A 


PNFTQVTSTLIiNS 


a6B 


LHFTSVFSVLiiKS 


a6C 


PYFNSVFSVLi/KS 


a6D 


LHFTSVFSIL&ES 


A7 


SHFNPAISTTLMS 


a7 


FNPEVSTTLIS 


A8B 


EYLSISFYG 


a8B 


HYTPVSLYG 


alO 


LHLWtjKC 


A12 


FREVLGFASTAVIG 


a!2 


YMQVLGFASLAFVS 

f 



■ J [ TM3- ] 

VW&JWGGIM | YliSGSIiLiAiS TEKNSRKCIi | j^GKMIMNSL^LFA^ISj^pL^ 1 8 2 

V^tL&fcGIM YlJIBiiiLApAEiTSRRsi pKAKVIMS SttfjL FA^I SpI^LS I MDILNMTLSHFLKMRRLEL I QTS KPYVD I Y DC EP SN S S EKN S P S TQ Y 17 6 
$mmGAIT | ^^^mXSElixm'yi j^G^£GA|)TA^IAGGTpITlglINLKKSL 
j^lSpAVL EvL^FjpilSERiN |LY| ^G|gGA^|vpiAgGTGIAMIiILNLTNNF 
pYPI^AVF F"cS|^TprVVftGliP |RTW IQr^^GMpA#ATI^LV(3TAFLSLNIAVNI 
P^I^AVF ijgSSdStrTV^GRNP Ir1$ MQN^FGljSCtASTT I AFVGTVFfc SVHLAFN * 
biTIpSVM l^ilGsis^GIRT TKGli ^G^Ml^TSSVLkASGILINTFSLAFYS 
MVf IWGSIM J |^V&0Sii^GVTP TKC£ | IVA^t^IT^liiATASiMGWSVAVGS 
MAPJWGPIM iiv^SlS^SGVKP TRSi|lISl|Tl|rTITSVURATASIMGVVSVAVGS 
LAfi^SIM fiFSGJl^liiGVKP TKAM IIsSsVpTIS§VLAVAAS|IGVISVISGV 
pPKW|SVL EJ-Ni&GAFLlAVKRfr |ET§ I IL^IMS(FLSALG^IA^I^Lj|TFGF IL 
a£p£i|pFF FipGSlJsiA'TEKRL £KL| pS^VGSlLlALS^LvlFilJ^SVKQATLNPASbQ 
GYPFIGALF F*VSGILSIVTETKS TRIE | ^DS^IiTLWItLSVS F&FMGl£l I SVSLAGLH PASEQ 
pfcp*lj£ALF | FIAjjSJGILSI ITERKS TJSPL ^A^TI^livSFAFVGIIIISVSLAGLHPASEQ 
GYP^V^ALF FAI&GILSIVTEKKM TKPIi V.HSSLALSILSVLSALTGIAILSVSLAALEPALQQ 
GYPFLGALC FGITGSLSIISGKQS TKPF DLSSLTSNAVSSVTAGAGLFLLADSKVALR 
GYLFIGSLC F/AIAGILSIISEKIS TKPF ALSS^Sl^ASSVVAVIGLFLFTYCLIALG 
GFPFWGGLW ^IISGSi^AA'ENQPYSYCli USG^iGLN^viAICSAVGVILFITDLSl PH 
GFPFWGGIW FIISGSI^V^AETQPNS PC£ LNGSVGL^I:F|aICSAVGIMLFITDI S I S S 
Vtf^PLWGTVS FLVAGMAAMTTVTF P KTSJj KVLCVIANVI^LFCALAGFFVIAKDLFLEG 
GVlFWGGLS FIISGStSVSASKEL SRC I* ^KGSjkaMKiVS S I LAF I GVJ&ELVDMC I NG 
GY;PFWGGLS FI ITG I iiCi LAS KKS SPAft IKS|EGMSXVpFFAFlj&^|LVDESING 



AYIHIHSCQKFFETK 171 
AYMNMCKNVTEDDG 162 
QSLRSCHSSSESPDL 162 

FHHPYCNYYGNSNN 154 
QFP 137 
QFP 137 
FR 136 

DQNYICGYSHQNSQ 155 
CELDKNNIPTRSYVSYFYHDSLYTTD 174 
CLQSKELRPTEYHYYQ FLDRNE 170 
CKQSKELSLIEHDYYQPFYNSDRSE 173 
CKLAFTQLDTTQDAYHFFSPEPLNS 17 3 
TASQKCGSEMDYLSSLPYSEYYYPIYEXKD 173 
SAFPHCNSEKKFLSLLSYLKSHHWKNEDXK 17 0 
PYAYPDYYPY 174 
GYIYPSYYPYQ 191 
PFPWPIWRPYPEPT 161 
VAGQ 191 
LPEQ 188 



TM4 J 

CYSIQSLFL)$IL.SV^iFAFFQELVIAGIVENE^RTCSRPKS|^^ 2 97 

GNSIQSVFL SlLSAMfsAFFQKLVTAGIVENEwiRMCTRSKS NVVL|sAGEKNEQTIKMKEEIIELSGVSSQPKNEEEIEIIPVQEEEEEEAEINFPAPPQEQESLPVENEIAP 291 



CFMASFS j T EIWMl|iiFLTIi;GIiGSAVSl,TICGAGEELKGNK | VPEDRVYEELNI YSATYSELEDPGEMSPPIDL 
g FVASFTT ELVLM^FLTILAFCSAVLFTIYRIGQELESKK VPDDRgYEELNVYSPIYSELEDKGETSSPVDS 
.._ CNYMGSISN GMVSU^fijLTLitl^VTISTIAlWCNANCCNSRE EISSPPNSV 
A4A CHGTMS ILM ^LIX?MVLLLSVL^FClAVSLSA^GC|CVLCCTPGG WLILPSHSHMAETASPTPLNEV 

a4B FRYNYTITK | GLDVLMLli FNMLEFC L^VS VS AJPGC EAS CCNS RE | VLVVtlSNPVETVMAPPMTLQPLLPSEHQGTNVPGNVYKNHPGEIV 
a4C FRYNYTITK GLDIti^iLNMLEFC IAVSISAFGCKASCCNSSE j VLVVt:PSNPAVTVMAPPVTIX3PLPPSEHQGKNl/PENVYKWHSEEIV 
a4D QFRSQPAIA SLDVLHTI LNMLEFC I AVSVS A'F,GcitASCCNSS E ^V^pSNSAVn-VTAPPMILQPLPPSECQGKNVPENLYRNQPGEIV 
A5 GKAVTVLFL JSILITLMTFSII^FISLPFSILGCHSEDCDCEQCC 

A6A CYTAKASLA GTL S L^I CTLLEFCL^VLTA VLRWKQA YS DFPG SVLFLPHSYIGNSGMSSKMTHDCGYBELLTS 
a63 CFAAKSVLA GW Sli^ STMLELGLAVLTAMLWwjcQSH SN I PG NVMFfcPHSSNNDSNMESKVLCNPSYEEQLVC 
a6C GAVTKSILT GALSVMIil I S VLELGLALLS AMLWtiREGVLTSLRM 

a6D CFVAKAALT GVFSU^SSVLELGLAVLTATLWWKQSSSAFSG NV I FXiSQNS KNK S S V S S ESLCN PTY EN I LT S 

A7 CLLTSVSLT G\TLV\^LIFTVLELI J U^AYSSVF^'WkoLYSNNPG SSFSSTQSQDHIQQVKKSSSRSWI 

a7 CYLAYVGAM SALGM^LLFTVLEVFLAGYSSIFWWKQVYSNKPG GTFFtf QSQDHTQLVKSNLLQ 

A8B AWGVNPGM AISGVLliVFCLliEFGIACASSHFGCQLVCCQSSN VSVIYgNIYAANPVITPEPVTSPPSYSSEIQANK 

afiB ENLGVRTGV AISSVLLIFCLLELSIASVSSHFGCQVACCHYNN PGWI : p;| NVYAANPWIPEPPNPIPSYSEWQDSR 

a: 0 TYIQRLELT LFCFTFLEIFLSGSTAITAYRMKRLQAEDKDDTP FVPOTEMELKGLSLGPPPSYKDVAQGHSSSDTGRALATSSGLLLASDSFHQALLHTGPRTLRK 
A12 DYWAV I.SGKGISATLMIFSLLEFFVACATAHFANQANTTTNM SVLVIPNMYESNPVTPASSSAPPRCNNYSANAPK 
al2 DYWAV L SGKG I S AMLI^tF SLtfE FC I TCVTAYFAS KT I TNTRGL SWSFHLCMQTVP 
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FIG. 2. Deduced amino acid sequences for CD20 (human Al, mouse al), FceRI/3 (A2 and a2), HTm4 (A3 and a3). and 16 new MS4A 
(human) and MS4a (mouse and pig) proteins. Gaps were introduced to optimize alignments. Numbers represent predicted residue positions. 
The predicted membrane-spanning regions (TM) are indicated. Predicted intron/exon splice junctions are indicated by vertical bars where 
information was available. Amino acids common to 10 or more proteins are shaded. *Partial sequence for the MS4a3 protein. CD20. FceRl/3. 
and HTm4 sequences and known intron/exon borders are as published (Adra et al, 1994: Kuster et al. 1992; Ra et al. 1989: Tedder et al, 
1988a,b, 1989b). MS4A12 represents a human colon mucosa cDNA sequence (GenBank AK000224) and MS4al2 represents a homologous 
cDNA sequence from pig (GenBank AJ236932). 



2). Notable was a marked clustering of charged resi- 
dues at both ends of the postulated transmembrane 
domains, some of which were highly conserved. In 
some cases, the first and second putative transmem- 
brane domains of MS4A proteins were a continuous 
stretch of hydrophobic amino acids without an obvious 



intertransmembrane hydrophilic bridge. In contrast, 
MS4A4A and MS4A7 had 6 to 7 hydrophilic amino 
acids inserted between the first and the second hydro-
phobic domains. In human MS4A4A and mouse
MS4a4B, MS4a4C, and MS4a4D, an extensive hydro-
phobic region followed the fourth proposed membrane-
spanning domain. Whether this region allows the pro-
tein to traverse the membrane a fifth time is unknown.
Nonetheless, the overall structure of MS4A family 
members was well conserved. 

Comparisons between CD20 and the predicted 
amino acid sequences for human MS4A4A, MS4A5, 
MS4A6A, MS4A7, MS4A8B, and MS4A12 revealed 23-
29% amino acid sequence identity (Fig. 2). The highest
degree of identity was found in the first three trans- 
membrane domains with multiple regions of conserved 
amino acids. In particular, the amino acid sequences 
LGAXQI and LSLG were common within the first 
transmembrane domain, GYPFWG and FIISGSLS 
were common in the second domain, and SLX2NX2SX3AX2G
was found in the third transmembrane do-
main. The first and second transmembrane domains of
MS4A8B were 46% identical in amino acid sequence
with human CD20, 41% identical with FcεRIβ, and
39% identical with HTm4. The MS4A4A, MS4A5,
MS4A6A, and MS4A7 proteins were most homologous
in their first and second transmembrane domains with
the human FcεRIβ chain, with 37-46% amino acid
sequence identity. There was large variation between 
MS4A proteins in the N- and C-terminal cytoplasmic 
domains. However, Pro residues were significantly 
overrepresented within the N- and C-terminal cyto- 
plasmic domains of most MS4A family members. There 
was some sequence identity in the first potential extra-
cellular loop that was ~13 amino acids in length for
each protein. In contrast, the second predicted extra- 
cellular loop ranged from 10 to 46 amino acids in length 
with diverse sequences. 
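
The identity percentages quoted above can be illustrated with a minimal sketch. It assumes identity is counted over aligned columns in which neither sequence has a gap; the text does not state the convention actually used. The function name and the two toy fragments are hypothetical and are not taken from Fig. 2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gap columns (one possible convention, assumed here)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration only.
seq_a = "GYPFWGGLSF-IISG"
seq_b = "GYPFLGALCFGITSG"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same alignment.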

Potential polymorphisms were identified in the 
MS4A6A gene. Two nucleotide substitutions were 
found in cDNA clone ATCC 499181 and in 13 of 38 EST
sequences analyzed (Fig. 1). The first substitution was 
at nucleotide 373 and exchanged a C for a T, which did 
not alter the amino acid sequence. The second substi- 
tution resulted in a Ser in place of Thr at amino acid 
185. In addition, a third substitution was found in 4 of 
the 38 EST sequences analyzed, in which a Ser was 
substituted in place of an Ala at amino acid position 
183. This substitution was paired with a Ser-to-Thr 
substitution at amino acid position 185 in half of the 
clones analyzed. These differences most likely repre- 
sent common sequence polymorphisms since they were 
observed in multiple independent cDNA clones. Al- 
though less likely, these differences could represent 
transcripts from distinct genes that are almost identi- 
cal in coding sequence. Other potential polymorphisms 
were observed in other MS4A family members based 
on consistent nucleotide variations found in large num- 
bers of overlapping EST sequences. However, we were 
unable to independently verify the existence of these 
sequence polymorphisms through the isolation and se- 
quencing of independent cDNAs bearing each potential 
polymorphism. 
FIG. 3. UPGMA (unweighted pair group method using arith-
metic averages) tree of deduced MS4A and MS4a protein sequences.
Horizontal tree branch length is a measure of sequence relatedness.
For example, MS4a4B and MS4a4C are the most similar in se-
quence, while CD20 (MS4A1) sequences were the most divergent
from other family members. The MS4a12p sequence was from pig,
while all other MS4a sequences were from mouse. The UPGMA tree
was generated using Geneworks version 2.0 (IntelliGenetics, Inc.,
Mountain View, CA).
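
As an illustration of the UPGMA procedure named in the caption, the following is a minimal sketch only. It is not the Geneworks implementation, and the pairwise distances are invented for the example; the function name `upgma` and the distance-table layout are assumptions. At each step the two closest clusters are merged, and distances to the new cluster are averaged with weights equal to the cluster sizes.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster with UPGMA; `dist` maps frozenset({a, b}) -> distance.

    Returns a nested tuple (left, right, height) describing the tree.
    """
    # Each cluster is keyed by its (sub)tree; the value is its leaf count.
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        merged = (a, b, height)
        n_a, n_b = sizes.pop(a), sizes.pop(b)
        # Size-weighted average distance to every remaining cluster.
        for other in sizes:
            d[frozenset((merged, other))] = (
                n_a * d[frozenset((a, other))] + n_b * d[frozenset((b, other))]
            ) / (n_a + n_b)
        sizes[merged] = n_a + n_b
    return next(iter(sizes))


# Invented distances (hypothetical), chosen only so that MS4a4B and MS4a4C
# are the closest pair and CD20 is the most distant, as the caption describes.
names = ["CD20", "MS4a4B", "MS4a4C"]
toy = {
    ("MS4a4B", "MS4a4C"): 0.2,
    ("CD20", "MS4a4B"): 0.8,
    ("CD20", "MS4a4C"): 0.8,
}
dist = {frozenset(pair): value for pair, value in toy.items()}
tree = upgma(names, dist)
print(tree)
```

The merge height is half the distance between the two clusters joined, which corresponds to the horizontal branch length in a tree drawn this way.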

Mouse MS4A proteins. Ten mouse MS4A proteins 
that shared 40-63% amino acid sequence identity with 
their potential human counterparts were identified 
(Fig. 2, Table 1). For comparison, the mouse and hu- 
man CD20 proteins are 74% identical in amino acid 
sequence (Tedder et al., 1988a). A single partial cDNA
that encoded the mouse homologue for HTm4 was iden- 
tified (MS4a3; Fig. 2). The predicted amino terminus of 
the proposed MS4a3 protein was 23 amino acids 
shorter than in the human protein, although their 
overlapping regions were 63% identical in amino acid 
sequence. In all cases, the transmembrane domains of 
the human and mouse MS4A proteins were the most 
well conserved regions. For example, the human 
MS4A8B protein was 78% identical in sequence to 
MS4a8B in the first three transmembrane domains 
and 68% identical in domain 4. The mouse MS4a10
protein shared little sequence homology with any of the
human MS4A proteins (Fig. 3). Therefore, additional
MS4A genes are likely to be identified in humans and
mice, including the mouse MS4A5 and human MS4a10
counterparts.

Expression of MS4A family members. Since CD20, 
FcεRIβ, and HTm4 expressions are restricted to hema-
topoietic tissues, MS4A gene transcription was as-
sessed by PCR amplification of cDNA from 11 human
TABLE 2

MS4A mRNA Expression by Human Lymphoblastoid Cell Lines

MS4A family member^a

Cell line 1 2 3 4A 5 6A 7 8B 12 G3PDH 



+++ + - +++ 



Pre-B 

NALM-6 - - - + + + 
B cell 

BJAB + + + - - + + + 

DAUDI + + + - - + - - + + + + - + + + 

SB + + + - ++ + + + + + + + - ^ + + 
T cell

HSB-2 - - ~ + - - - - - + + + 

HUT-78 - - - + - - + - + + 

JURKAT - - - + - - - ~ + + + 

MOLT 15 - - - + - - ++ - - + + + 
Myelomonocyte 

HL60 - - + + + ++ - + + + + + + - - + + + 

U937 - - + + + + + + + + + + + - - + + + 
Erythroleukemia 

K562 - + + + + + + + + ~ + + + 



a Gene transcription was assessed by PCR amplification of cDNA generated from mRNA isolated from each cell type. Values represent the
level of PCR product generated relative to the G3PDH control in three separate PCRs: -, no specific PCR product detected; +, low levels of
the appropriate band were detectable; ++ to +++, appropriate bands of increasing intensity were readily visualized in all samples
examined. Identical results were obtained using two different primer pairs for cDNA amplification.



hematopoietic cell lines. Like CD20, MS4A8B was ex- 
pressed only by B cell lines (Table 2). MS4A5 was 
expressed only by a promonocytic cell line. MS4A6A 
transcripts were expressed by B, myelomonocytic, and 
erythroleukemia cell lines. MS4A4A mRNA was ex- 
pressed by all cell lines examined, although the rela-
tive mRNA levels varied significantly. MS4A7 was ex-
pressed in most, but not all, of the cell lines tested.
MS4A12 transcripts were not detected in these cell 
lines. Thus, most MS4A family members are likely to 
be expressed in hematopoietic tissues. 

ESTs encoding MS4A transcripts were isolated from 
a variety of different cDNA libraries. MS4A4A ESTs 
were from aorta, brain, breast, heart, kidney, lung, 
ovary, pancreas, placenta, prostate, stomach, testis, 
and uterine tissues. MS4A5 ESTs were isolated only 
from testis. MS4A6A ESTs were from aorta, brain, 
central nervous system, colon, gall bladder, heart, kid- 
ney, lung, muscle, ovary, pancreas, placenta, prostate, 
skin, stomach, tonsil, uterus, and embryonic tissues. 
MS4A7 ESTs were from lung, kidney, lymphocytes, 
mammary gland, placenta, spleen, testis, thymus, and 
uterine tissues. MS4A8B ESTs were from brain, lung, 
uterus, and embryonic tissues. A single MS4A12 EST
was isolated from colon. This demonstrates differential 
MS4A gene transcription among lymphoid and non- 
lymphoid tissues. 

MS4a gene expression by mouse tissues was as- 
sessed by Northern analysis and PCR amplification of 
cDNAs (Table 3). In most cases assessed, Northern 
analysis failed to detect specific MS4a transcripts in 
tissues that revealed transcript production by PCR 
amplification (data not shown). These results suggest 
that MS4a transcripts are produced only by subpopu- 



lations of cells within each tissue such that transcript 
levels were often below the level of detection by North- 
ern analysis. Nonetheless, MS4a4B, MS4a4C, and 
MS4a6B transcripts were found at high levels in thy- 
mus, spleen, and peripheral lymph nodes, with less 
abundant levels in nonlymphoid tissues. MS4a6C was 
expressed only by thymus, spleen, PLN, and bone mar-
row. MS4a4C, MS4a6D, and MS4a7 were expressed in
all tissues examined. MS4a8B transcripts were ex-
pressed by spleen, peripheral lymph nodes, colon, liver,
heart, lung, and bone marrow. MS4a10 transcripts
were found in thymus, kidney, colon, brain, and testis.
In addition, CD20 (MS4a1), FcεRIβ (MS4a2), and
MS4a3 expressions were primarily restricted to hema-
topoietic tissues. MS4a3, MS4a4B, MS4a4C, MS4a6B,
MS4a6C, MS4a6D, MS4a7, MS4a8B, and MS4a10
were also expressed by various hematopoietic and lym- 
phoblastoid cell lines (data not shown). Therefore, most 
MS4a family members were expressed by hematopoi- 
etic cells. 

MS4A gene chromosome localization. Chromosome 
locations for the human MS4A4A, MS4A6A, MS4A7, 
and MS4A8B genes were identified in two distinct ho- 
mology searches. Regions of human MS4A4A (bp 
1286-1588), MS4A6A (bp 682-1106), MS4A7 (bp 502- 
941), MS4A7 (bp 1015-1177), and MS4A8B (bp 1007-
1350) (Fig. 1) were 98, 98, 97, 99, and 97% identical
with human STS genomic sequence tag sites
WM1578, SHGC-36634, WI-12101, WIAF-3856, and
WI-14145, respectively (http://www.ncbi.nlm.nih.gov/
blast). These genomic sequence tag sites are located on
human chromosome 11 at Genomic Database locus
D11S1357-D11S913, which maps to 11q12-q13
TABLE 3

MS4a Gene Expression by Mouse Tissues^a

mRNA expression

MS4a    Thymus    Spleen    PLN    BM    Liver    Kidney    Heart    Colon    Lung    Brain    Testes


1 


+ 


+ + + 


+ + + 


+ 










+ 






2 


+ 


+ 


+ 


+ + + 










4 






o 








+ + + 










4 


4 




4B 


+ + + 


+ + + 


+ + + 


+ + 


+ 




+ 




4 






4C 


+ + + 


+ + + 


+ + + 


+ + + 


+ 


+ 


+ 


+ 


4 






4D 


+ 


+ 








+ 




+ + 


44 






6B 


+ + + 


+ + + 


+ + + 


+ + 


+ 




+ 


+ 


4 




4 4 


6C 


+ 


+ 


+ 


+ + 
















6D 


+ + + 


+ + + 


+ + + 




+ + + 


+ + + 


■f + + 


+ + + 


444 


4 




7 


4- -}- 


+ 4- 


+ + 










+ 4- 


4 






8B 




+ 


+ 


+ 


+ 






4- 4 








10 


+ 














+ 








G3PDH 


+ + + 


+ + + 


+ + + 


+ 4- + 


+ + + 


+ + + 


+ + + 


+ + + 


444 


-i- 4 4 





a Gene transcription was assessed by PCR amplification of cDNA generated from mRNA isolated from tissue samples. Values represent the
level of PCR product generated relative to the G3PDH control as described for Table 2. PLN, peripheral lymph node; BM, bone marrow.



(http://www.ncbi.nlm.nih.gov/genemap). These map- 
ping results were confirmed using the UniGene collec- 
tion at the National Center for Biotechnology Informa- 
tion (http://www.ncbi.nlm.nih.gov/Genemap98/) for 
expressed sequence tags identical to human MS4A4A, 
MS4A6A, MS4A7, and MS4A8B sequences. Thus, at 
least seven of the nine currently identified human 
MS4A genes are clustered. 

DISCUSSION 

Sixteen novel genes that encoded proteins sharing 
structural and sequence homologies with CD20 
(MS4A1), FcεRIβ (MS4A2), and HTm4 (MS4A3) were
identified (Fig. 3). Together, these proteins demon- 
strate the existence of a new gene family that we have 
termed MS4A. All the MS4A proteins contain four po- 
tential membrane-spanning domains with both N- and 
C-terminal cytoplasmic domains. Although the MS4A 
proteins are structurally similar to other membrane- 
spanning proteins with four transmembrane domains, 
their sequences are distinct from any previously known 
gene family. All MS4A proteins are similar in size, with 
MS4A5 (200 amino acids) being the smallest and hu- 
man CD20 the largest (Fig. 2). Moreover, 7 of the 9 
currently identified human MS4A genes (MS4A1,
MS4A2, MS4A3, MS4A4A, MS4A6A, MS4A7, and
MS4A8B) were located at chromosome 11q12-q13.1.
In addition, MS4A1, MS4A2, MS4a4B, MS4a4C,
MS4a6B, and MS4a6C had predicted intron/exon
boundaries at similar positions within their deduced 
proteins (Fig. 2). These features demonstrate that 
these genes evolved from a common precursor and 
share a close evolutionary history (Fig. 3). 

Among the human and mouse MS4A proteins, the 
most significant homologies between family members 
were found in the first three membrane-spanning do- 
mains (Fig. 2). Common amino acid motifs were readily 



visualized, such as KXLGAIQI, GYPXWG, and SGX- 
LSI, in the first and second transmembrane regions. In 
the third and fourth membrane-spanning domains, 
conserved amino acids were appropriately spaced such 
that one face of the potential α-helical protein is highly
conserved in sequence (Fig. 2). The numbers of highly 
conserved residues in the membrane-spanning do- 
mains and the high degree of sequence conservation at 
the amino acid level between human and mouse family 
members suggest that the transmembrane regions 
play a critical role in MS4A protein function. 
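
The conserved motifs named above can be searched for in a sequence with a short pattern match. This is a minimal sketch that assumes X in the motif notation stands for any single residue; the helper names and the toy fragment are hypothetical and the fragment is not a sequence from Fig. 2.

```python
import re

# Motifs quoted in the text; X is read here as "any one residue" (an assumption
# about the notation), so it is translated to the regex wildcard ".".
MOTIFS = ["KXLGAIQI", "GYPXWG", "SGXLSI"]


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif string into a regular expression."""
    return re.compile(motif.replace("X", "."))


def find_motifs(sequence: str) -> dict:
    """Return {motif: [0-based start positions]} for each motif found."""
    hits = {}
    for motif in MOTIFS:
        starts = [m.start() for m in motif_to_regex(motif).finditer(sequence)]
        if starts:
            hits[motif] = starts
    return hits


# Hypothetical fragment assembled for illustration only.
toy = "MKALGAIQILAAAGYPFWGGLSFIISGSLSIAA"
print(find_motifs(toy))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for motifs of this length but would need adjusting for repeats that overlap.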

Outside of the transmembrane regions, there is con- 
siderable diversity in size and sequence of the MS4A 
proteins (Fig. 3). The predicted second extracellular 
loops that form the bulk of the extracellular domains of 
these proteins showed little or no sequence homology 
between family members. This suggests that this do- 
main can tolerate many changes without affecting any 
possible functional role or that this domain has under- 
gone significant divergence to serve specific functions. 
The N- and C-terminal cytoplasmic domains were also 
divergent between family members except that they 
were Pro rich in most cases. The basis for this unusual 
feature is unknown, but suggests that these domains 
may display unique structural and/or functional char- 
acteristics. Surprisingly, CD20 was the most divergent 
family member (Fig. 3). In contrast, the three members 
of the MS4a4 and three members of the MS4a6 sub- 
families were highly conserved (70-84% of the amino 
acids were identical within the mouse MS4a4 and 
MS4a6 subfamily, with MS4a4B and MS4a4C the most 
similar in sequence). Since only single human counter- 
parts for mouse MS4a4 and MS4a6 subfamilies were 
found, additional MS4A genes are likely to exist. 
Therefore, this family of transmembrane proteins may 
be larger than is currently appreciated. 

Two common sequence polymorphisms were identi- 
fied in the MS4A6A gene (Fig. 1), with a third likely. 
Polymorphisms in other MS4A family members are
also likely to exist since there were consistent substi- 
tutions observed among multiple EST sequences. How- 
ever, we were unable to isolate independent cDNAs in 
these cases corresponding to each potential polymor- 
phism. Nonetheless, atopic allergies have been linked 
to human chromosome 11q12-q13 (reviewed in Kinet,
1999), on which at least seven of the nine currently 
identified human MS4A genes reside. Specifically, sev- 
eral coding sequence polymorphisms in the MS4A2 
gene have been associated with allergic phenotypes. 
However, the involvement of other MS4A members in 
this process cannot be ruled out since MS4A polymor- 
phisms alone may not account for all features of 
asthma and allergies (Adra et al., 1999; Furumoto et
al., 2000; Mohan et al., 1999). The further character-
ization of additional MS4A family members may pro-
vide information for understanding immunologic ab- 
normalities that may lead to the development of atopy 
and other allergic responses. This is particularly rele- 
vant since many of the novel MS4A family members 
identified in the current study were preferentially ex- 
pressed by cells of the immune system. 

CD20, FcεRIβ, and HTm4 expressions are restricted
to discrete hematopoietic cell subpopulations and tis- 
sues. However, expression of the new MS4A family 
members was diverse among human and mouse tis- 
sues, although most family members were expressed 
by hematopoietic tissues and cell lines (Tables 2 and 3). 
For example, CD20 is B cell restricted, while MS4A4A 
is broadly expressed by most hematopoietic lineages 
and tissues (Table 2). Among hematopoietic cells, the 
MS4A genes were differentially regulated during lin- 
eage commitment and/or maturation. For example, 
MS4a6D was broadly expressed by lymphoid and non- 
lymphoid tissues, but numerous hematopoietic cell 
lines did not express these transcripts. Even among 
closely related subfamily members, such as MS4a6B, 
MS4a6C, and MS4a6D, there were considerable differ- 
ences in expression patterns among tissues and cell 
lines. Therefore, it is likely that the MS4A proteins will
have functions in multiple diverse cell types in addition 
to hematopoietic tissues. 

CD20 forms homo- or hetero-oligomeric complexes 
(Bubien et al., 1993; Tedder and Engel, 1994) and
FcεRIβ forms tetrameric complexes with α and γ
chains (Blank et al., 1989). Other MS4A family mem-
bers may be involved in the formation of similar mul-
timolecular complexes. Given this, the tetrameric 
CD20 complex may be composed of CD20 and other 
members of the MS4A family since they are all of 
similar size (Table 1). This is supported by the finding 
that B lymphocytes and most other cell types express 
multiple MS4A family members (Tables 2 and 3). Thus, 
the MS4A family may resemble the GABA-A receptor 
gene family. GABA-A receptors are ligand-gated ion 
channels generated by the assembly of five individual 
subunits that are structurally similar to MS4A family 
members (Whiting, 1999). GABA-A receptor subunits 



have four transmembrane domains containing con- 
served amino acids with a large protein loop between 
the third and the fourth transmembrane domains. As- 
sembly of structurally similar, but distinct receptor 
subunits in differing stoichiometries generates 
GABA-A receptors with differing pharmacological 
properties (Sieghart et al., 1999). MS4A family mem-
bers may assemble to generate similar heterologous
complexes. Nonetheless, the sequence homologies 
within the transmembrane domains of MS4A family 
members suggest that these proteins have functional 
properties similar to CD20 or FcεRIβ subunits that
function either directly as ligand-gated ion channels or 
as essential components of receptor complexes. Given 
that anti-CD20 monoclonal antibodies are effective im- 
munotherapeutics for human malignancies, other 
MS4A family members may also serve as targets for 
therapeutic intervention. 
Asthma is a complex syndrome in which bronchial inflammation and
smooth muscle hyperactivity lead to labile airflow obstruction. The 
commonest form of asthma is that due to atopy, which is an immune 
disorder where production of IgE to inhaled antigens leads to 
bronchial mucosal inflammation. The ultimate origins of asthma are 
interactive environmental and genetic factors. The genetics is acknowl- 
edged to be heterogeneous, and one chromosomal region of interest 
and controversy has been 11q13. To clarify the nature of the chromo-
some 11q13 effect in atopy and asthma, we conducted a genetic associ-
ation study in subjects with marked atopic asthma and matched
controls, which incorporated the study of 13 genetic variants over a
distance of 10-12 cM and which took account of detailed immune and 
clinical phenotyping. Association with high IgE levels was limited to 
the interval flanked by D11S1335 and CD20 in a 0.8-Mb interval and 
was greatest for variants of FcεRIβ and HTm4; these variants also
associated with asthma (recurrent wheeze with labile airflow obstruc- 
tion and need for regular inhaler treatment). At the more telomeric 
marker, D11S480, variants associated with asthma, but not with high 
IgE levels. The data might support the possibility of multiple loci rele-
vant to atopic asthma on chromosome 11q13.
Atopy is a common disorder characterized by in-
creased general IgE responsiveness (1). Atopy is



1 These two authors contributed equally. 

2 Present address: Department of Pulmonology and Allergy,
Medical University of Lodz, Poland.

3 Present address: Department of Cardiology, Hammersmith
Hospital, London, UK.



also an important cause of disorder in the skin 
(eczema), lung (asthma), and the nose (rhinitis), 
and family studies suggest variable combinations 
of organ-specific clinical syndromes in individuals 
within atopic families (1).

A significant portion of atopic asthmatic families 
in Caucasian populations may be linked to chro-
mosome 11q13 through the maternal line (2). Data
from Japan (3) and Germany (4), using lod scores,
and from the Netherlands (5), using affected sib-
pair methods, have confirmed linkage in families
with marked atopy irrespective of clinical symp-
toms. The β subunit of the high affinity IgE recep-
tor (FcεRIβ) gene has been mapped to
chromosome 11q13.1 (6, 7) and is a candidate gene
for atopy because of its important role in initiating 
type I allergic reaction by mast cells and basophils. 
A recent large-scale population-based linkage 
study by sib-pair methodology affirms linkage of 
asthma with microsatellite repeats of FcεRIβ, but
not with other markers on 11q13, in an Australian
population (8). 

The FcεRIβ gene is composed of seven exons
and six introns, spanning approximately 11 kb (9).
Six variants of this gene have been identified, in-
cluding three coding and three non-coding vari-
ants. Leu181Ile/Leu183Val variants of FcεRIβ
have been identified in some British (10) and Aus- 
tralian (11) asthmatic families and showed signifi- 
cant association with atopic asthma (12, 13). 
Another coding variant, Gly237Glu, is also associ- 
ated with atopy and/or asthma in Australian, 
British (14) and Japanese (15) populations. Three 
non-coding variants, including (CA)n in the fifth
intron (16) and two RsaI restriction fragment
length polymorphisms (RFLPs) in the second in- 
tron (17) and in the 3' untranslated region of exon 
7 (16), also showed strong genetic association with 
heightened IgE responsiveness, suggesting the can-
didacy of FcεRIβ as an important cause for atopic
asthma (18). 

Recently, the gene encoding the four-transmem- 
brane protein, HTm4, has been cloned and 
mapped to the same chromosome 11q13 region by
our group (19). The high structural and topological 



homology (24-60% at transmembranous portions) 
to CD20 and FcεRIβ enables us to propose that
HTm4, FcεRIβ, and CD20 evolved from the same
ancestral gene to form a family of four-transmem-
brane proteins (19). Since this family is selectively
expressed in hematopoietic lineages, HTm4 is also a
candidate locus for type I allergic disorder. 

Despite the accumulation of data in which 
FcεRIβ associates with atopy, genetic linkage or
association for atopic asthma has been found on
11q13.1 with other markers: FGF3 (20), D11S534
(21), or D11S97 (22) in relation to total serum IgE
level, or D11S527 (21) or CC16 (renamed from
CC10) (23) in relation to asthma or bronchial
hyper-responsiveness. Others have found neither asso-
ciation nor linkage for atopic asthmatic pheno-
types with any genes or markers in the 11q13.1
region (24-31). This area of research, therefore, 
still remains controversial because of genetic het- 
erogeneity or differences in phenotype assignments 
among researchers. 

The aim of this research is, therefore, to clarify 
the localization of atopy and/or asthma using poly- 
morphic genes or microsatellite markers that span 
the entire 11q13.1 region (Fig. 1) in a freshly
collected random case-control population from
Oxfordshire, where the original linkage to 11q13
was found. 



Materials and methods 

Subjects selection 

One hundred and fifty pregnant women were 
drawn as controls from the Obstetrics Clinic in 
Oxford as a general population control (32). One 
hundred and twenty-five patients with atopic 
Fig. 1. Schematic presentation of genetic distance on the whole region of 11q13. The left and right ends are centromeric and
telomeric, respectively. Estimated distances are based on references (34, 38, 52). Symbols on the line are genes and under the line
are microsatellite markers. The exact localization of Kappa, UCP2, D11S527, and D11S534 remains undetermined, though EST of
Kappa is localized telomeric to PYGM. C1NH: complement component 1 inhibitor; CC10 (CC16): Clara cell 10; CHRM1:
cholinergic receptor, muscarinic 1; CNTF: ciliary neurotrophic factor; FGF3: fibroblast growth factor 3; GIF: gastric intrinsic
factor; GSTP1: glutathione S-transferase pi; HTm4: human transmembranous protein 4; IDDM: insulin-dependent diabetes
mellitus; MEN1: multiple endocrine neoplasia 1; PGA: pepsinogen A; PLCβ3: phospholipase Cβ3; PYGM: phosphorylase,
glycogen; muscle; ROM1: retinal outer segment membrane protein 1; TCN1: transcobalamin 1.
Table 1. Association of variants in ten genes and three microsatellite markers on 11q13 with atopy or asthma phenotype
Distance from GIF (cM)^a
1.32
a Estimated distance on the basis of references.

b Odds against control subjects.

c Allele with highest odds.

d RsaI RFLP in the second intron.

*p<0.05.
**p<0.01.



asthma were collected from the Osler Chest Unit
in Oxford; all were Caucasians (32). 

All the asthmatic subjects had specialist 
physician-diagnosed asthma with 1) recurrent 
breathlessness and chest tightness requiring 
on-going treatment, 2) physician-documented 
wheeze, and 3) documented labile airflow 
obstruction with variability in serial peak 
expiratory flow rates greater than 30%. They 
showed a positive skin prick test of greater than 
5 mm against any common antigens or a positive 
IgE (> 0.7 IU/ml) in serum. Marked asthma was 
designated as chronic rather than episodic asthma 
with physicians' use of multidrug therapy 
including steroid inhalers. Atopy was assessed by 
IgE serology (see below). There were no heavy 
smokers (> 20 cigarettes per day) among these 
subjects. 

Serological analysis 

Specific IgE against house dust mite (HDM) and 
grass pollen mix (GX) was detected by the CAP 
system (Pharmacia, Uppsala, Sweden). The 
criteria for a positive titer of allergen-specific 
IgE were as used previously (2, 6). A high total 
IgE by CAP system was taken to be greater than 
published normal values for children or greater 
than 120 kU/l (mean + 1 SD) in adults (2, 6). 
Atopy, defined as IgE responsiveness, was 
diagnosed as the presence of a high concentration 
of total serum IgE, a positive specific IgE titer 
against one or more aero-allergens, or a 
combination of these two features. 
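
The atopy definition above is a simple either/or rule, which can be sketched in Python as follows. This is an illustrative sketch only: the class and function names are hypothetical, the 120 kU/l cutoff is the adult value quoted in the text, and the allergen-specific positivity criterion (taken from refs 2 and 6 and not restated here) is passed in as a ready-made flag rather than guessed.

```python
from dataclasses import dataclass

# Adult cutoff for "high" total serum IgE quoted in the text (kU/l).
# Children would need age-specific published normal values instead.
ADULT_TOTAL_IGE_CUTOFF = 120.0


@dataclass
class Serology:
    """Hypothetical container for one subject's serology results."""
    total_ige: float             # total serum IgE, kU/l
    specific_ige_positive: bool  # True if any aero-allergen titer met the
                                 # positivity criteria of refs (2, 6)


def is_atopic(s: Serology, total_ige_cutoff: float = ADULT_TOTAL_IGE_CUTOFF) -> bool:
    """Atopy = high total IgE, OR a positive specific IgE titer, OR both."""
    high_total = s.total_ige > total_ige_cutoff
    return high_total or s.specific_ige_positive
```

Either feature alone is enough to classify a subject as atopic, so a subject with a normal total IgE but a positive specific titer still counts.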



DNA analysis 

DNA samples were extracted using a commercial 
kit (IsoQuick, Microprobe Corporation, Garden 
Grove, USA). The polymorphic microsatellite 
repeats for D11S480, D11S1335, and D11S1883 were 
amplified from genomic DNA with 
rhodamine-labeled primers by polymerase chain 
reaction (PCR), and images were obtained by 
scanning in 6% polyacrylamide gel with a 
fluorescent image analyzer, ABI Prism 310 (Perkin 
Elmer, USA). RFLPs for the ten loci, UCP2, ROM1, 
CHRM1, GSTP1, Kappa, CC10/CC16, GIF, CD20, HTm4, 
and FcεRIβ were studied. Primers and PCR 
conditions were described elsewhere for GIF (17), 
CD20 (17), CC16 (23, 32), CHRM1 (33), Kappa (34), 
ROM1 (35), GSTP1 (36), UCP2 (37), and FcεRIβ at 
Gly237Glu (15) and in intron 2 (17) and 3' tail 
(16). Primers for new polymorphisms were 
5'CCGATTGGGGTGCTGTGT and 
5'ATTCAGCTCTGGGGCACCA for HTm4, and 
5'TGGGGACAATTCCAGAAGATT and 
5'TCCTGTGGGAGAGCAAGATT for FcεRIβ 
in the 5' promoter region. PCR in a mixture 
including 1.5 mmol/l of magnesium chloride in 100 
µl was performed in a Perkin Elmer Cetus thermal 
cycler using a preliminary cycle (94°C for 5 min) 
and then 34 cycles for HTm4 (94°C for 30 s, 58°C 
for 60 s, and 72°C for 90 s) or for FcεRIβ in the 
5' promoter (94°C for 30 s, 60°C for 30 s, and 
72°C for 30 s). Amplification products were 
digested with TaqI for HTm4, BbvI for UCP2, ScaI 
for CD20, DraIII for GIF, MseI for ROM1, BsmAI 
for GSTP1, ApaI for Kappa, and Sau96I for CC16 
(Table 1). For FcεRIβ, MseI for the 5' promoter 
region, RsaI for intron 2 and exon 7 untranslated 
tail (Table 1), and XmnI for Gly237Glu were 
studied. The polymorphic microsatellite repeats 
for D11S1335, D11S1883, and D11S480 were 
amplified from genomic DNA with fluorescent 
forward primers in a PCR, and images were 
obtained by scanning in 6% polyacrylamide gel 
with an automated sequencer (ABI 310). Sequencing 
was conducted with a big-dye system (ABI, UK) 
using downstream primers, and images were 
visualized in the commercialized POP-6 gel using 
the automated sequencer (ABI Prism 310 Genetic 
Analyzer). 

Statistics 

Contingency table analysis, odds ratios (ORs), 95% 
confidence intervals (CI), and significance values 
were estimated by computerized exact methods 
(SPSS program). OR was calculated between high 
and low risk genotypes in genes and between high 
and low risk alleles in markers. 
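
As a rough illustration of the quantities named above, the sketch below computes an odds ratio with an approximate 95% CI and a two-sided Fisher exact p-value from a 2x2 table. The analysis in this study was run with exact methods in SPSS; the Woolf (log-based) interval used here is a swapped-in approximation, not the authors' procedure, and the function names and any counts passed to them are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval.

    a: cases with high-risk genotype     b: cases with low-risk genotype
    c: controls with high-risk genotype  d: controls with low-risk genotype
    All four cells must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the same 2x2 table.

    Sums the hypergeometric probabilities of every table with the observed
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(x_min, x_max + 1)
               if prob(x) <= p_observed * (1 + 1e-9))
```

An OR whose interval excludes 1 corresponds to a nominally significant association; with small cell counts the exact p-value is the safer guide.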

Results 

The genotypes for newly identified RFLPs for 
HTm4 (TaqI) and for FcεRIβ (MseI) were 
confirmed by sequencing multiple clones of the 
PCR products from random samples (n = 5) (data 
not shown). 

The allele frequencies for the nine loci in the 
control subjects were fully consistent with 
Hardy-Weinberg equilibrium (data not shown). 
There was a significant association between 
atopic asthma and variants of the FcεRIβ and HTm4 
genes and D11S480 (Table 1). GIF and D11S1335, 
centromeric to FcεRIβ (38), showed no association 
with severe atopy (Table 1). No significant 
association was found with any form of atopy or 
asthma phenotype at the other loci. 
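
A check of Hardy-Weinberg equilibrium at a biallelic locus is commonly done with a 1-df chi-square goodness-of-fit test on genotype counts. The text does not state which test was used, so the sketch below is one conventional choice; the function name is hypothetical and no study data are reproduced.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    if p in (0.0, 1.0):
        raise ValueError("locus is monomorphic; the test is undefined")

    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A large p-value means the control genotype counts give no evidence of departure from equilibrium, which is the situation the text reports.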

There was a significant difference in genotype 
frequencies of FcεRIβ variants between control 
and atopic asthma subjects (Table 1). This was 
strongest with the intron 2 variants (OR = 2.98, 
95%CI 1.88-4.97, p<0.01). This polymorphism 
was significantly associated with any combination 
of atopy phenotypes and showed the highest OR of 
4.77 (95%CI 2.45-10.77, p<0.01) with a marked 
atopy phenotype (high total serum IgE > 300 
IU/ml and allergen-specific IgE for HDM and GX 
both > 3.5 IU/ml of serum). However, the coding 
variant, Gly237Glu, was not associated with any 
combination of atopy or asthma phenotype. As 
shown in Table 1, a genetic association was also 
found between the variant of HTm4 and atopic 
asthma (OR = 2.59, 95%CI 1.21-4.88, p<0.01); 
this variant was associated with marked atopy 
(OR = 3.31, 95%CI 1.14-13.99, p<0.05), though 
the highest odds of 6.00 (95%CI 1.45-12.33, 
p<0.05) was found with HDM > 1.5 IU/ml. In terms 
of perennial, chronic asthma, the RsaI RFLP in 
intron 2 of the FcεRIβ gene, as well as the TaqI 
RFLP in intron 3 of the HTm4 gene, showed 
slightly increased ORs of 3.37 (95%CI 1.98-8.97, 
p<0.01) and 3.13 (95%CI 1.41-7.88, p<0.01), 
respectively (Table 1). 

There was a strong linkage disequilibrium 
between the intron 2 RsaI RFLP and the 3' tail 
RsaI polymorphism (t = Δ/SE = 2.88, p<0.001), but 
this was weaker with the MseI polymorphism in 
the 5' promoter region (t = Δ/SE = 2.09, p = 0.03) 
in FcεRIβ. The variant of the HTm4 gene was in 
strong linkage disequilibrium (t = Δ/SE = 2.44, 
p<0.001) with the intron 2 variant of the FcεRIβ 
gene. 
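
The statistic t = Δ/SE divides a disequilibrium coefficient by its standard error. The text does not say how Δ or its SE was estimated from the genotype data, so the sketch below shows only the textbook case, under the assumption that phased two-locus haplotype counts are available and that the SE is the large-sample value under the null hypothesis of no disequilibrium. The function name and argument layout are hypothetical.

```python
import math


def ld_t_statistic(n11, n12, n21, n22):
    """Disequilibrium coefficient and t = delta / SE from haplotype counts.

    n11: haplotypes carrying allele 1 at both loci
    n12: allele 1 at locus A, allele 2 at locus B
    n21: allele 2 at locus A, allele 1 at locus B
    n22: allele 2 at both loci
    Both loci must be polymorphic, otherwise SE is zero.
    """
    n = n11 + n12 + n21 + n22
    p_a = (n11 + n12) / n            # allele 1 frequency at locus A
    p_b = (n11 + n21) / n            # allele 1 frequency at locus B
    delta = n11 / n - p_a * p_b      # observed minus expected haplotype frequency

    # Large-sample standard error of delta when the loci are independent.
    se = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b) / n)
    return delta, delta / se
```

Under these assumptions t squared equals the usual 2x2 chi-square statistic for the haplotype table, so t can be referred to a standard normal distribution.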

One of the alleles, 201 bp of D11S480, was 
associated with asthma (OR = 2.07, 95%CI 
1.00-4.98, p = 0.05); the OR increased up to 2.77 
(95%CI 1.18-7.04, p<0.05) in association with 
perennial, chronic asthma. However, it was not 
associated with a marked atopy phenotype (OR = 
1.23, 95%CI 0.68-5.98, χ², p>0.1, Table 1) nor 
with any other form of atopy phenotype. 

Discussion 

In this study, we found two patterns of association 
with atopic asthma on the 11q13.1 region: in one, 
FcεRIβ and HTm4 are associated with high serum 
IgE levels, and in the other, D11S480 is associated 
with asthma per se. 

Linkage has been reported between atopic 
asthma and the FcεRIβ locus in British (1, 2, 
39-41), Australian (42), German (4), and Japanese 
(3, 43) families. A large number of affected sibs 
from the Australian general population also 
showed strong linkage between FcεRIβ and 
asthma, even in the absence of atopy (8). This 
raises the question whether this linkage is through 
bronchial hyper-responsiveness or through general 
IgE responsiveness. In this study, we found a strong 
genetic association between atopic asthma and one 
of the non-coding variants of the FcεRIβ gene, 
the RsaI RFLP in intron 2. The OR for marked 
asthma (chronic and perennial) increased to 3.37, 
whereas the greatest OR was found for the marked 
atopy phenotype. In German families (30), severe 
atopy was linked to FcεRIβ, though lod scores 
were negative in tests with mild atopy phenotypes. 
Multivariate analysis demonstrated that the 
contribution of FcεRIβ to atopic asthma was 
stronger among patients with higher total IgE 
(Hagihara et al., unpublished data). Furthermore, 
three coding variants, Leu181Ile, Leu183Val, and 
Gly237Glu, showed strong association with atopy, 
especially with high total serum IgE levels in 
distinct ethnic groups (10-17), while no such 
association was seen with intrinsic asthma (17), 
suggesting that the linkage or association of 
atopic asthma with FcεRIβ is mainly through IgE 
responsiveness. This is supported by the 
association or linkage between FcεRIβ and another 
clinical atopic disorder, eczema, in Germany (4), 
Britain (44), and Japan (Ohta et al., submitted), 
since IgE responsiveness is a common feature of 
atopic asthma and eczema. 

A growing body of evidence supports the 
candidacy of FcεRIβ as an atopy locus. However, 
the functional action of FcεRIβ variants in 
promoting atopy remains undefined. A direct 
functional link of the coding variant, Leu181, in 
the FcεRIβ gene was hypothesized. However, neither 
significant up-regulation of histamine release 
from basophils (45) nor of phosphorylation of the 
receptors has been found (J-P Kinet, personal 
communication). More recently, the function of 
human FcεRIβ has been described more precisely: it 
amplifies the intensity of cell activation signals 
through the FcεRIγ chain with a gain of 5-7-fold 
when it couples with FcεRI αγ receptors in vitro 
(46) and in vivo (47). This 'amplifier' function 
is of particular interest in view of several 
non-coding polymorphisms (16-18), which might 
relate to expression levels of this gene. Further 
molecular biochemical studies are needed to test 
whether these polymorphisms are responsible for 
quantitative change in FcεRIβ's amplifying signals 
in basophils and mast cells among atopic subjects. 

Another explanation is that an unrecognized 
atopy gene sits close to FcεRIβ and its variants 
associate tightly with those of FcεRIβ. A new 
member of the CD20/FcεRIβ family, the HTm4 
gene, has been cloned (19), and recent fine 
mapping places FcεRIβ and HTm4 less than 70 kb 
apart (Adra et al., submitted). HTm4 spans about 
13 kb and consists of seven exons, a structure 
quite similar to that of FcεRIβ (Adra et al., 
submitted). A TaqI RFLP in the third intron of 
the gene was identified and showed a strong 
association with atopic asthma. Since this variant 
showed ORs for marked asthma, as well as for 
marked atopy phenotypes, similar to those for 
intron 2 of the FcεRIβ gene, HTm4 might be 
regarded as a candidate locus for atopy on 
11q13.1. Another possibility is a locus between 
the two genes, FcεRIβ and HTm4, at which 
variants confer the atopy phenotype. Further 
physical and functional mapping is under way in 
our laboratories. 



Chromosome 11q13 and atopic asthma 

The second locus for asthma on 11q13.1 is 
between D11S480 and D11S1883, approximately 5 
cM telomeric to FcεRIβ (Fig. 1). We have 
previously shown strong linkage with clinical 
symptoms of asthma at D11S480 in 40 British 
asthmatic families, but not with atopy phenotype 
(Dubowitz et al., unpublished data). In this 
study, one of the alleles of D11S480 was 
associated with marked asthma, but not with any 
combination of atopy phenotypes. These findings 
suggest that a clinical asthma locus may be 
localized in close relation to D11S480, 
independently of the FcεRIβ/HTm4 locus on 11q13.1. 

The autosomal recessive disorder, Bardet-Biedl 
syndrome (BBS), which is characterized by retinal 
degeneration, polydactyly, obesity, mental 
retardation, hypogenitalism, renal dysplasia, and 
short stature, is heterogeneous with at least four 
loci (BBS1-4) to date (48). Almost half of 
Caucasian families have been linked to 11q13.1 
(BBS1), and the highest lods were found at 
D11S1883 with no recombination. Interestingly, a 
quarter of patients linked to BBS1 showed atopic 
asthma (48). This indicates that BBS1 might be in 
linkage disequilibrium with an atopic asthma locus 
between D11S480 and PYGM (Fig. 1). More recently, 
strong genetic association was found between 
childhood asthma and CC16 (23), 1 cM centromeric 
to D11S480 on 11q13.1. Biallelic and 
microsatellite variants have been identified in this 
gene. However, no association was found between 
these variants and asthma phenotype in British 
(32) and Japanese (49) populations. Another 
candidate is CHRM1 (33), a muscarinic receptor on 
airways. However, no association was found in our 
population. Since our patients show no association 
with D11S1883, data from this study, on BBS1, and 
on CC16 suggest that the locus for asthma might 
be localized between D11S480 and D11S1883 in a 
300-kb interval. 

A third atopic asthma locus on 11q13.1 has been 
reported telomeric to FGF3, more than 10-12 cM 
away from FcεRIβ. In 131 British families, an 
allelic association was found between D11S527 and 
bronchial hyper-responsiveness (21). Furthermore, 
at D11S534, one of the alleles was associated with 
total IgE levels (21). We therefore examined this 
area with two candidate genes, GSTP1 (36) and UCP2 
(39); in the former, variant Ile104Val has been 
identified and is of particular interest in relation to 
alcohol- or chemical-induced asthma; the latter is 
specifically expressed in macrophages. However, 
no association was found between any kind of 
atopy or asthma phenotypes and variants of these 
genes. Australian families linked to FcεRIβ did not 
favor linkage to loci 8-9 cM either side of this 
gene (49, 50). Also, two genome-wide searches (42, 
51) failed to find linkage with markers in this area, 
including FGF3. Since the associations were only 
found with microsatellite repeats (FGF3, D11S527, 
and D11S534) in this region and are extremes in a 
huge number of tests, these associations might be 
due to a type I error, despite testing with both 
parametric and non-parametric approaches (21). 

In conclusion, there might be two loci for atopic 
asthma on 11q13.1. One is for atopy (high IgE 
levels) and lies between D11S1335 and CD20 in an 
interval of 0.8 Mb; the other is for asthma per se 
(wheeze and labile airway obstruction) and lies 
close to D11S480. These findings might explain the 
diversity among the reports, with or without 
association and/or linkage to different loci on 
11q13. 